You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@flink.apache.org by "echauchot (via GitHub)" <gi...@apache.org> on 2023/04/14 14:29:34 UTC

[GitHub] [flink-web] echauchot opened a new pull request, #643: Add blog article: Howto test a batch source with the new Source framework

echauchot opened a new pull request, #643:
URL: https://github.com/apache/flink-web/pull/643

   Last article of the series about testing the created batch source in the previous article.
   
   R:@zentol. Feel free to delegate as you have already the previous 2 articles to review.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on a diff in pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on code in PR #643:
URL: https://github.com/apache/flink-web/pull/643#discussion_r1192029322


##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on a diff in pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on code in PR #643:
URL: https://github.com/apache/flink-web/pull/643#discussion_r1192157171


##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment

Review Comment:
   ok fair enough, in that case I'll also remove "runtime" from the "Flink runtime environment" title for coherence.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on a diff in pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on code in PR #643:
URL: https://github.com/apache/flink-web/pull/643#discussion_r1192004345


##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that

Review Comment:
   :+1: 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on a diff in pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on code in PR #643:
URL: https://github.com/apache/flink-web/pull/643#discussion_r1192083594


##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on a diff in pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on code in PR #643:
URL: https://github.com/apache/flink-web/pull/643#discussion_r1192021200


##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)

Review Comment:
   :+1: 



##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on PR #643:
URL: https://github.com/apache/flink-web/pull/643#issuecomment-1545544022

   > > I took a quick look over it. I feel especially for the IT case it would be nice to first provide an overview over the differnet parts that make up a test "There's a test framework; We need a backend the source can read from, a test context for Flinks testing framework to interact with the backend, ..." and then drill down into the details as you already did.
   > > Right now it's very "I have to do this and this and this" but you don't really know where you're heading.
   > 
   > Totally agree, I'll do something similar to what I did on the source creation blog: an architecture paragraph
   
   done


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] zentol commented on a diff in pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "zentol (via GitHub)" <gi...@apache.org>.

zentol commented on code in PR #643:
URL: https://github.com/apache/flink-web/pull/643#discussion_r1189562453


##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same

Review Comment:
   ```suggestion
   we create an object, serialize it using the serializer and then deserialize it using the same
   ```



##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)

Review Comment:
   ```suggestion
   [Example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
   ```



##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde

Review Comment:
   ```suggestion
   for Split and SplitEnumeratorState. We should now test them in unit tests. To test serde
   ```
   using "usual" in a howto seems odd



##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment
+
+[example Cassandra TestEnvironment](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraTestEnvironment.java)
+
+We add this annotated field to our ITCase
+
+`@TestExternalSystem
+BackendTestEnvironment backendTestEnvironment = new BackendTestEnvironment();
+`
+
+BackendTestEnvironment
+implements [TestResource](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html)
+. This environment is scoped to the test suite so it is where we setup the backend runtime and
+shared resources (session, tablespace, etc...) by
+implementing [startup()](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html#startUp--)
+and [tearDown()](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html#tearDown--)
+methods. For
+that we advice the use of [testContainers](https://www.testcontainers.org/) that relies on docker
+images to provide a real backend
+instance (not a mock) that is representative for integration tests. Several backends are supported
+out of the box by testContainers. We need to configure test containers that way:
+
+* Redirect container output (error and standard output) to Flink logs
+* Set the different timeouts to cope with CI server load
+* Set retrial mechanisms for connection, initialization requests etc... for the same reason
+
+### Checkpointing semantics
+
+In big data execution engines, there are 2 levels of guaranty regarding source and sinks:

Review Comment:
   ```suggestion
   In big data execution engines, there are 2 levels of guarantee regarding source and sinks:
   ```



##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment

Review Comment:
   Maybe some high-level intro line what this is about could be helpful.
   
   Something like "To test the connector we need a backend to run the connector against. Something something need to implement something to integrate this into junit5; maybe provided by library."



##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment
+
+[example Cassandra TestEnvironment](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraTestEnvironment.java)
+
+We add this annotated field to our ITCase
+
+`@TestExternalSystem
+BackendTestEnvironment backendTestEnvironment = new BackendTestEnvironment();
+`
+
+BackendTestEnvironment
+implements [TestResource](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html)
+. This environment is scoped to the test suite so it is where we setup the backend runtime and
+shared resources (session, tablespace, etc...) by
+implementing [startup()](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html#startUp--)
+and [tearDown()](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html#tearDown--)
+methods. For
+that we advice the use of [testContainers](https://www.testcontainers.org/) that relies on docker
+images to provide a real backend
+instance (not a mock) that is representative for integration tests. Several backends are supported
+out of the box by testContainers. We need to configure test containers that way:
+
+* Redirect container output (error and standard output) to Flink logs
+* Set the different timeouts to cope with CI server load
+* Set retrial mechanisms for connection, initialization requests etc... for the same reason
+
+### Checkpointing semantics
+
+In big data execution engines, there are 2 levels of guaranty regarding source and sinks:
+
+* At least once: upon failure and recovery, some records may be reflected multiple times but none
+  will
+  be lost
+* Exactly once: upon failure and recovery, every record will be reflected exactly once
+  By the following code we verify that the source supports exactly once semantics:
+
+`@TestSemantics
+CheckpointingMode[] semantics = new CheckpointingMode[] {CheckpointingMode.EXACTLY_ONCE};
+`
+
+That being said, we could encounter a problem while running the tests : the default assertions in
+the Flink source test framework assume that the data is read in the same order it was written. This
+is untrue for most big data backends where ordering is usually not deterministic. To support
+unordered checks and still use all the framework provided tests, we need to override
+[SourceTestSuiteBase#checkResultWithSemantic](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html#checkResultWithSemantic-org.apache.flink.util.CloseableIterator-java.util.List-org.apache.flink.streaming.api.CheckpointingMode-java.lang.Integer-)
+in out ITCase:
+
+`@Override
+protected void checkResultWithSemantic(
+CloseableIterator<Pojo> resultIterator,
+List<List<Pojo>> testData,
+CheckpointingMode semantic,
+Integer limit) {
+if (limit != null) {
+Runnable runnable =
+() ->
+CollectIteratorAssertions.assertUnordered(resultIterator)
+.withNumRecordsLimit(limit)
+.matchesRecordsFromSource(testData, semantic);
+
+        assertThat(runAsync(runnable)).succeedsWithin(DEFAULT_COLLECT_DATA_TIMEOUT);
+    } else {
+        CollectIteratorAssertions.assertUnordered(resultIterator)
+                .matchesRecordsFromSource(testData, semantic);
+    }
+
+}`
+
+This is simply a copy-paste of the parent method where _CollectIteratorAssertions.assertOrdered()_
+is
+replaced by _CollectIteratorAssertions.assertUnordered()_.
+
+### Test context
+
+[example Cassandra TestContext](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraTestContext.java)
+
+The test context is scoped to the test case. So it is where we do things like creating test table,
+creating the source or writing test data.

Review Comment:
   ```suggestion
   The test context provides Flink with means to interact with the backend, like inserting test data/tables or constructing sources/sink.
   ```



##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment

Review Comment:
   ```suggestion
   ### Backend environment
   ```
   It is never referenced as "backend **runtime** environment" in subsequent sections



##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment
+
+[example Cassandra TestEnvironment](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraTestEnvironment.java)
+
+We add this annotated field to our ITCase
+
+`@TestExternalSystem
+BackendTestEnvironment backendTestEnvironment = new BackendTestEnvironment();

Review Comment:
   ```suggestion
   MyBackendTestEnvironment backendTestEnvironment = new MyBackendTestEnvironment();
   ```
   Maybe makes it a bit clearer that this is something implemented by the user.



##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment
+
+[example Cassandra TestEnvironment](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraTestEnvironment.java)
+
+We add this annotated field to our ITCase
+
+`@TestExternalSystem
+BackendTestEnvironment backendTestEnvironment = new BackendTestEnvironment();
+`
+
+BackendTestEnvironment
+implements [TestResource](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html)
+. This environment is scoped to the test suite so it is where we setup the backend runtime and
+shared resources (session, tablespace, etc...) by
+implementing [startup()](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html#startUp--)
+and [tearDown()](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html#tearDown--)
+methods. For
+that we advice the use of [testContainers](https://www.testcontainers.org/) that relies on docker
+images to provide a real backend
+instance (not a mock) that is representative for integration tests. Several backends are supported
+out of the box by testContainers. We need to configure test containers that way:
+
+* Redirect container output (error and standard output) to Flink logs
+* Set the different timeouts to cope with CI server load
+* Set retrial mechanisms for connection, initialization requests etc... for the same reason
+
+### Checkpointing semantics
+
+In big data execution engines, there are 2 levels of guaranty regarding source and sinks:
+
+* At least once: upon failure and recovery, some records may be reflected multiple times but none
+  will
+  be lost
+* Exactly once: upon failure and recovery, every record will be reflected exactly once
+  By the following code we verify that the source supports exactly once semantics:

Review Comment:
   ```suggestion
   
   By the following code we verify that the source supports exactly once semantics:
   ```



##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that

Review Comment:
   ```suggestion
   we create an *ITCase named class that
   ```
   The name only really matters if the test is added to Flink. I think we shouldn't assume that because users might have their own sources they want to test that won't be contributed to Flink.



##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment
+
+[example Cassandra TestEnvironment](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraTestEnvironment.java)
+
+We add this annotated field to our ITCase
+
+`@TestExternalSystem
+BackendTestEnvironment backendTestEnvironment = new BackendTestEnvironment();
+`
+
+BackendTestEnvironment
+implements [TestResource](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html)
+. This environment is scoped to the test suite so it is where we setup the backend runtime and
+shared resources (session, tablespace, etc...) by
+implementing [startup()](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html#startUp--)
+and [tearDown()](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html#tearDown--)
+methods. For
+that we advice the use of [testContainers](https://www.testcontainers.org/) that relies on docker
+images to provide a real backend
+instance (not a mock) that is representative for integration tests. Several backends are supported
+out of the box by testContainers. We need to configure test containers that way:
+
+* Redirect container output (error and standard output) to Flink logs
+* Set the different timeouts to cope with CI server load
+* Set retrial mechanisms for connection, initialization requests etc... for the same reason
+
+### Checkpointing semantics
+
+In big data execution engines, there are 2 levels of guaranty regarding source and sinks:
+
+* At least once: upon failure and recovery, some records may be reflected multiple times but none
+  will
+  be lost
+* Exactly once: upon failure and recovery, every record will be reflected exactly once
+  By the following code we verify that the source supports exactly once semantics:
+
+`@TestSemantics
+CheckpointingMode[] semantics = new CheckpointingMode[] {CheckpointingMode.EXACTLY_ONCE};
+`
+
+That being said, we could encounter a problem while running the tests : the default assertions in
+the Flink source test framework assume that the data is read in the same order it was written. This
+is untrue for most big data backends where ordering is usually not deterministic. To support
+unordered checks and still use all the framework provided tests, we need to override
+[SourceTestSuiteBase#checkResultWithSemantic](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html#checkResultWithSemantic-org.apache.flink.util.CloseableIterator-java.util.List-org.apache.flink.streaming.api.CheckpointingMode-java.lang.Integer-)
+in out ITCase:
+
+`@Override
+protected void checkResultWithSemantic(
+CloseableIterator<Pojo> resultIterator,
+List<List<Pojo>> testData,
+CheckpointingMode semantic,
+Integer limit) {
+if (limit != null) {
+Runnable runnable =
+() ->
+CollectIteratorAssertions.assertUnordered(resultIterator)
+.withNumRecordsLimit(limit)
+.matchesRecordsFromSource(testData, semantic);
+
+        assertThat(runAsync(runnable)).succeedsWithin(DEFAULT_COLLECT_DATA_TIMEOUT);
+    } else {
+        CollectIteratorAssertions.assertUnordered(resultIterator)
+                .matchesRecordsFromSource(testData, semantic);
+    }
+
+}`
+
+This is simply a copy-paste of the parent method where _CollectIteratorAssertions.assertOrdered()_

Review Comment:
   ```suggestion
   This is a copy-paste of the parent method where _CollectIteratorAssertions.assertOrdered()_
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on a diff in pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on code in PR #643:
URL: https://github.com/apache/flink-web/pull/643#discussion_r1192167315


##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment
+
+[example Cassandra TestEnvironment](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraTestEnvironment.java)
+
+We add this annotated field to our ITCase
+
+`@TestExternalSystem
+BackendTestEnvironment backendTestEnvironment = new BackendTestEnvironment();
+`
+
+BackendTestEnvironment
+implements [TestResource](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html)
+. This environment is scoped to the test suite so it is where we setup the backend runtime and
+shared resources (session, tablespace, etc...) by
+implementing [startup()](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html#startUp--)
+and [tearDown()](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html#tearDown--)
+methods. For
+that we advice the use of [testContainers](https://www.testcontainers.org/) that relies on docker
+images to provide a real backend
+instance (not a mock) that is representative for integration tests. Several backends are supported
+out of the box by testContainers. We need to configure test containers that way:
+
+* Redirect container output (error and standard output) to Flink logs
+* Set the different timeouts to cope with CI server load
+* Set retrial mechanisms for connection, initialization requests etc... for the same reason
+
+### Checkpointing semantics
+
+In big data execution engines, there are 2 levels of guaranty regarding source and sinks:
+
+* At least once: upon failure and recovery, some records may be reflected multiple times but none
+  will
+  be lost
+* Exactly once: upon failure and recovery, every record will be reflected exactly once
+  By the following code we verify that the source supports exactly once semantics:
+
+`@TestSemantics
+CheckpointingMode[] semantics = new CheckpointingMode[] {CheckpointingMode.EXACTLY_ONCE};
+`
+
+That being said, we could encounter a problem while running the tests : the default assertions in
+the Flink source test framework assume that the data is read in the same order it was written. This
+is untrue for most big data backends where ordering is usually not deterministic. To support
+unordered checks and still use all the framework provided tests, we need to override
+[SourceTestSuiteBase#checkResultWithSemantic](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html#checkResultWithSemantic-org.apache.flink.util.CloseableIterator-java.util.List-org.apache.flink.streaming.api.CheckpointingMode-java.lang.Integer-)
+in out ITCase:
+
+`@Override
+protected void checkResultWithSemantic(
+CloseableIterator<Pojo> resultIterator,
+List<List<Pojo>> testData,
+CheckpointingMode semantic,
+Integer limit) {
+if (limit != null) {
+Runnable runnable =
+() ->
+CollectIteratorAssertions.assertUnordered(resultIterator)
+.withNumRecordsLimit(limit)
+.matchesRecordsFromSource(testData, semantic);
+
+        assertThat(runAsync(runnable)).succeedsWithin(DEFAULT_COLLECT_DATA_TIMEOUT);
+    } else {
+        CollectIteratorAssertions.assertUnordered(resultIterator)
+                .matchesRecordsFromSource(testData, semantic);
+    }
+
+}`
+
+This is simply a copy-paste of the parent method where _CollectIteratorAssertions.assertOrdered()_
+is
+replaced by _CollectIteratorAssertions.assertUnordered()_.
+
+### Test context
+
+[example Cassandra TestContext](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraTestContext.java)
+
+The test context is scoped to the test case. So it is where we do things like creating test table,
+creating the source or writing test data.

Review Comment:
   Agree with the rephrasing but I'd like also to keep the important notion of test case scope of the TestContext (compared to test suite scope of the backend environment)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on PR #643:
URL: https://github.com/apache/flink-web/pull/643#issuecomment-1545283430

   > I took a quick look over it. I feel especially for the IT case it would be nice to first provide an overview over the differnet parts that make up a test "There's a test framework; We need a backend the source can read from, a test context for Flinks testing framework to interact with the backend, ..." and then drill down into the details as you already did.
   > 
   > Right now it's very "I have to do this and this and this" but you don't really know where you're heading.
   
   Totally agree, thanks for the suggestion


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on a diff in pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on code in PR #643:
URL: https://github.com/apache/flink-web/pull/643#discussion_r1192157171


##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment

Review Comment:
   ok fair enough, in that case I'll also remove "runtime" from "Flink runtime environment" title for coherence.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on a diff in pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on code in PR #643:
URL: https://github.com/apache/flink-web/pull/643#discussion_r1192162301


##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on a diff in pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on code in PR #643:
URL: https://github.com/apache/flink-web/pull/643#discussion_r1192171185


##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment
+
+[example Cassandra TestEnvironment](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraTestEnvironment.java)
+
+We add this annotated field to our ITCase
+
+`@TestExternalSystem
+BackendTestEnvironment backendTestEnvironment = new BackendTestEnvironment();
+`
+
+BackendTestEnvironment
+implements [TestResource](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html)
+. This environment is scoped to the test suite so it is where we setup the backend runtime and
+shared resources (session, tablespace, etc...) by
+implementing [startup()](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html#startUp--)
+and [tearDown()](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/TestResource.html#tearDown--)
+methods. For
+that we advice the use of [testContainers](https://www.testcontainers.org/) that relies on docker
+images to provide a real backend
+instance (not a mock) that is representative for integration tests. Several backends are supported
+out of the box by testContainers. We need to configure test containers that way:
+
+* Redirect container output (error and standard output) to Flink logs
+* Set the different timeouts to cope with CI server load
+* Set retrial mechanisms for connection, initialization requests etc... for the same reason
+
+### Checkpointing semantics
+
+In big data execution engines, there are 2 levels of guaranty regarding source and sinks:
+
+* At least once: upon failure and recovery, some records may be reflected multiple times but none
+  will
+  be lost
+* Exactly once: upon failure and recovery, every record will be reflected exactly once
+  By the following code we verify that the source supports exactly once semantics:
+
+`@TestSemantics
+CheckpointingMode[] semantics = new CheckpointingMode[] {CheckpointingMode.EXACTLY_ONCE};
+`
+
+That being said, we could encounter a problem while running the tests : the default assertions in
+the Flink source test framework assume that the data is read in the same order it was written. This
+is untrue for most big data backends where ordering is usually not deterministic. To support
+unordered checks and still use all the framework provided tests, we need to override
+[SourceTestSuiteBase#checkResultWithSemantic](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html#checkResultWithSemantic-org.apache.flink.util.CloseableIterator-java.util.List-org.apache.flink.streaming.api.CheckpointingMode-java.lang.Integer-)
+in out ITCase:
+
+`@Override
+protected void checkResultWithSemantic(
+CloseableIterator<Pojo> resultIterator,
+List<List<Pojo>> testData,
+CheckpointingMode semantic,
+Integer limit) {
+if (limit != null) {
+Runnable runnable =
+() ->
+CollectIteratorAssertions.assertUnordered(resultIterator)
+.withNumRecordsLimit(limit)
+.matchesRecordsFromSource(testData, semantic);
+
+        assertThat(runAsync(runnable)).succeedsWithin(DEFAULT_COLLECT_DATA_TIMEOUT);
+    } else {
+        CollectIteratorAssertions.assertUnordered(resultIterator)
+                .matchesRecordsFromSource(testData, semantic);
+    }
+
+}`
+
+This is simply a copy-paste of the parent method where _CollectIteratorAssertions.assertOrdered()_
+is
+replaced by _CollectIteratorAssertions.assertUnordered()_.
+
+### Test context
+
+[example Cassandra TestContext](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraTestContext.java)
+
+The test context is scoped to the test case. So it is where we do things like creating test table,
+creating the source or writing test data.

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on a diff in pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on code in PR #643:
URL: https://github.com/apache/flink-web/pull/643#discussion_r1192030443


##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde
+we just create an object, serialize it using the serializer and then deserialize it using the same
+serializer and finally assert on the equality of the two objects. Thus, hascode() and equals() need
+to be implemented for the serialized objects.
+
+### Other unit tests
+
+Of course, we also need to unit test low level processing such as query building for example or any
+processing that does not require a running backend.
+
+## Integration testing the source
+
+[example Cassandra SourceITCase
+](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraSourceITCase.java)
+
+For tests that require a running backend, Flink provides a JUnit5 source test framework. To use it
+we create an *ITCase named class that
+extends [SourceTestSuiteBase](https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/connector/testframe/testsuites/SourceTestSuiteBase.html)
+. This test suite provides all
+the necessary tests already (single split, multiple splits, idle reader, etc...). It is targeted for
+batch and streaming sources, so for our batch source case here, the tests below need to be disabled
+as they are targeted for streaming sources. They can be disabled by overriding them in the ITCase
+and annotating them with @Disabled:
+
+* testSourceMetrics
+* testSavepoint
+* testScaleUp
+* testScaleDown
+* testTaskManagerFailure
+
+Of course we can add our own integration tests cases for example tests on limits, tests on low level
+splitting or any test that requires a running backend. But for most cases we only need to provide
+Flink test environment classes to configure the ITCase:
+
+### Flink runtime environment
+
+We add this annotated field to our ITCase and we're done
+
+`@TestEnv
+MiniClusterTestEnvironment flinkTestEnvironment = new MiniClusterTestEnvironment();
+`
+
+### Backend runtime environment
+
+[example Cassandra TestEnvironment](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/test/java/org/apache/flink/connector/cassandra/source/CassandraTestEnvironment.java)
+
+We add this annotated field to our ITCase
+
+`@TestExternalSystem
+BackendTestEnvironment backendTestEnvironment = new BackendTestEnvironment();

Review Comment:
   agree



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on PR #643:
URL: https://github.com/apache/flink-web/pull/643#issuecomment-1534918324

   @MartijnVisser @zentol did almost all my last reviews. Would you have time to review this small article as you know the source framework ?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot merged pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot merged PR #643:
URL: https://github.com/apache/flink-web/pull/643


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on a diff in pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on code in PR #643:
URL: https://github.com/apache/flink-web/pull/643#discussion_r1192003814


##########
docs/content/posts/howto-test-batch-source.md:
##########
@@ -0,0 +1,202 @@
+---
+title:  "Howto test a batch source with the new Source framework"
+date: "2023-04-14T08:00:00.000Z"
+authors:
+- echauchot:
+  name: "Etienne Chauchot"
+  twitter: "echauchot"
+
+---
+
+## Introduction
+
+The Flink community has
+designed [a new Source framework](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/dev/datastream/sources/)
+based
+on [FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)
+lately. This article is the
+continuation of
+the [howto create a batch source with the new Source framework article](https://flink.apache.org/2023/04/14/howto-create-batch-source/)
+. Now it is
+time to test the created source ! As the previous article, this one was built while implementing the
+[Flink batch source](https://github.com/apache/flink-connector-cassandra/commit/72e3bef1fb9ee6042955b5e9871a9f70a8837cca)
+for [Cassandra](https://cassandra.apache.org/_/index.html).
+
+## Unit testing the source
+
+### Testing the serializers
+
+[example Cassandra SplitSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/split/CassandraSplitSerializer.java)
+and [SplitEnumeratorStateSerializer](https://github.com/apache/flink-connector-cassandra/blob/d92dc8d891098a9ca6a7de6062b4630079beaaef/flink-connector-cassandra/src/main/java/org/apache/flink/connector/cassandra/source/enumerator/CassandraEnumeratorStateSerializer.java)
+
+In the previous article, we
+created [serializers](https://flink.apache.org/2023/04/14/howto-create-batch-source/#serializers)
+for Split and SplitEnumeratorState. We should now test them in unit tests. As usual, to test serde

Review Comment:
   agree



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [flink-web] echauchot commented on pull request #643: Add blog article: Howto test a batch source with the new Source framework

Posted by "echauchot (via GitHub)" <gi...@apache.org>.

echauchot commented on PR #643:
URL: https://github.com/apache/flink-web/pull/643#issuecomment-1545554335

   @zentol Thanks for your review (once again). I addressed all your comments PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org