You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2020/09/18 18:22:38 UTC

[GitHub] [kafka] shaikzakiriitm opened a new pull request #9306: KAFKA-10477: Enabling the same behavior of NULL JsonNodeType to MISSI…

shaikzakiriitm opened a new pull request #9306:
URL: https://github.com/apache/kafka/pull/9306


   …NG JsonNodeType in JsonConverter.
   
   From v2.10.0 onwards, in jackson lib, ObjectMapper.readTree(input) started to return JsonNode of type MISSING for empty input, as mentioned in the issue: https://github.com/FasterXML/jackson-databind/issues/2211.
   
   This caused to throw a `DataException["Unknown schema type: null"]` when an empty message key was parsed to be converted to Connect format via JsonConverter. The `schemaType` for `MISSING` type of JsonNode was returned as null, which resulted in this exception. 
    
   As part of this PR, made changes in JsonParser to incorporate the Jackson's ObjectMapper treatment of empty input from v2.10.0 onwards. Treating MISSING JsonNodeType in a similar fashion as that of NULL JsonNodeType.
   Added a unit test for this.
   
   Related Jira ticket captures further details: https://issues.apache.org/jira/browse/KAFKA-10477 
   All unit tests passed locally. 
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] rhauch commented on pull request #9306: KAFKA-10477: Enabling the same behavior of NULL JsonNodeType to MISSI…

Posted by GitBox <gi...@apache.org>.
rhauch commented on pull request #9306:
URL: https://github.com/apache/kafka/pull/9306#issuecomment-702447474


   @shaikzakiriitm I added another test to verify that having a null within a JSON field is not a problem. It builds for me locally, and the new test passes when I've tested it against 2.3.1, 2.2.2, 2.4.0, 2.4.1, 2.5.0, 2.5.1, 2.6.0. However, even though it doesn't fail with these versions like your other test, it is still useful to at least verify that such behavior is unchanged.
   
   If we get a green build, I'll merge as discussed above.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] rhauch commented on pull request #9306: KAFKA-10477: Enabling the same behavior of NULL JsonNodeType to MISSI…

Posted by GitBox <gi...@apache.org>.
rhauch commented on pull request #9306:
URL: https://github.com/apache/kafka/pull/9306#issuecomment-700182723


   @shaikzakiriitm, you mentioned above:
   > From v2.10.0 onwards, in jackson lib, ObjectMapper.readTree(input) started to return JsonNode of type MISSING for empty input, as mentioned in the issue: [FasterXML/jackson-databind#2211](https://github.com/FasterXML/jackson-databind/issues/2211).
   
   However, the [FasterXML/jackson-databind#2211](https://github.com/FasterXML/jackson-databind/issues/2211) you mention above talks about this behavior changing in 2.9 relative to 2.8. This seems to not align with the your assertion in the [KAFKA-10477](https://issues.apache.org/jira/browse/KAFKA-10477) that:
   > Things were working fine when the dependency on jackson lib was of version  v2.9.10.3
   
   Can you clarify?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] shaikzakiriitm commented on pull request #9306: KAFKA-10477: Enabling the same behavior of NULL JsonNodeType to MISSI…

Posted by GitBox <gi...@apache.org>.
shaikzakiriitm commented on pull request #9306:
URL: https://github.com/apache/kafka/pull/9306#issuecomment-702222463


   > @shaikzakiriitm, you mentioned above:
   > 
   > > From v2.10.0 onwards, in jackson lib, ObjectMapper.readTree(input) started to return JsonNode of type MISSING for empty input, as mentioned in the issue: [FasterXML/jackson-databind#2211](https://github.com/FasterXML/jackson-databind/issues/2211).
   > 
   > However, the [FasterXML/jackson-databind#2211](https://github.com/FasterXML/jackson-databind/issues/2211) you mention above talks about this behavior changing in 2.9 relative to 2.8. This seems to not align with the your assertion in the [KAFKA-10477](https://issues.apache.org/jira/browse/KAFKA-10477) that:
   > 
   > > Things were working fine when the dependency on jackson lib was of version  v2.9.10.3
   > 
   > Can you clarify?
   
   Hi, I have shared more details in the comment https://github.com/apache/kafka/pull/9306#issuecomment-702221113 . Kindly let me know if any further details are needed. Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] shaikzakiriitm commented on pull request #9306: KAFKA-10477: Enabling the same behavior of NULL JsonNodeType to MISSI…

Posted by GitBox <gi...@apache.org>.
shaikzakiriitm commented on pull request #9306:
URL: https://github.com/apache/kafka/pull/9306#issuecomment-696107218


   Failures are in ConnectionQuotasTest, and ConsumerBounceTest test classes, and appear unrelated to the change in this PR. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] shaikzakiriitm commented on pull request #9306: KAFKA-10477: Enabling the same behavior of NULL JsonNodeType to MISSI…

Posted by GitBox <gi...@apache.org>.
shaikzakiriitm commented on pull request #9306:
URL: https://github.com/apache/kafka/pull/9306#issuecomment-702221113


   Following are the version dependencies of Apache Kafka (ak) on com.fasterxml.jackson.core:jackson-databind (j)
   
   - ak v2.1.0 -> j v2.9.7
   - ak v2.1.1 -> j v2.9.8
   - ak v2.2.0 -> j v2.9.8
   - ak v2.2.1 -> j v2.9.8
   - ak v2.2.2 -> j v2.10.0
   - ak v2.3.0 -> j v2.9.9 [**Looks like an outlier, where ak version increased with a decrease in jackson-databind version**]
   - ak v2.3.1 -> j v2.10.0
   - ak v2.4.0 -> j v2.10.0
   - ak v2.4.1 -> j v2.10.0
   - ak v2.5.1 -> j v2.10.2
   - ak v2.6 -> j v2.10.2
   
   The behavior of the Jackson  ObjectMapper#readTree() method when called with a blank string/empty-input changed as follows (based on the information from FasterXML/jackson-databind#2211) :
   - For Jackson v2.x-2.8.x, return NullNode
   - For Jackson v2.9.x return null
   - For Jackson v2.10.0  and beyond return MissingNode
   
   However, in all AK versions JsonConverter throws an error when it gets a MissingNode. This is only a problem for AK versions (v2.2.2, v2.3.1, v2.4.0, v2.4.1, v2.5.1) that were released with Jackson 2.10.0 or later:
   
   - All versions of AK 2.1.x-2.2.1, ObjectMapper.readTree() returns null on empty input, which AK populates in an envelope, and the payload in the envelope which is null is treated as json node of JsonNodeType NULL. And this NULL case is handled in `JsonConverter#convertToConnect()`.
   - AK version 2.2.2, ObjectMapper.readTree() returns MissingNode on empty input, which AK populates as payload in envelope; this payload is a jsonNode of JsonNodeType MISSING. And `JsonConverter#convertToConnect()` considers `schemaType` to be null for MISSING case, which results in `DataException`.
   - AK 2.3.0, (uses jackson-databind version 2.9.9), ObjectMapper.readTree() returns null on empty input, which AK populates as payload in envelope; this payload when retrieved in to be used in JsonConverter.convertToConnect(), is treated as NullNode, of JsonNodeType NULL. (as in the case of AK 2.1.x-2.2.1).
   - AK 2.3.1-2.6, ObjectMapper.readTree() returns MissingNode on empty input, which AK populates as payload in envelope; this payload is a jsonNode of type MISSING which causes the DataException.
   
   So, this PR aims to resolve the DataException that gets thrown when jackson-databind's `ObjectMapper.readTree()` returns MissingNode upon encountering an empty input, so that JsonConverter treats MissingNode in a similar behavior as that of null. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] rhauch commented on pull request #9306: KAFKA-10477: Enabling the same behavior of NULL JsonNodeType to MISSI…

Posted by GitBox <gi...@apache.org>.
rhauch commented on pull request #9306:
URL: https://github.com/apache/kafka/pull/9306#issuecomment-702443976


   Thanks for putting together this comprehensive list of AK versions, Jackson versions, and behavior.
   
   > * ak v2.3.0 -> j v2.9.9 [Looks like an outlier, where ak version increased with a decrease in jackson-databind version]
   I don't think this was an outlier. If you sort them by release dates, things look a bit more logical and you can see how the Jackson versions only continually increase over time:
   
   ak v2.1.0 (2018-11-20)-> j v2.9.7
   ak v2.1.1 (2019-02-15) -> j v2.9.8
   ak v2.2.0 (2019-03-22) -> j v2.9.8
   ak v2.2.1 (2019-06-01) -> j v2.9.8
   ak v2.3.0 (2019-06-24) -> j v2.9.9
   ak v2.3.1 (2019-10-24) -> j v2.10.0
   ak v2.2.2 (2019-12-01) -> j v2.10.0
   ak v2.4.0 (2019-12-13) -> j v2.10.0
   ak v2.4.1 (2020-03-10)-> j v2.10.0
   ak v2.5.0 (2020-04-14) -> j v2.10.2
   ak v2.6.0 (2020-08-03) -> j v2.10.2
   ak v2.5.1 (2020-08-10) -> j v2.10.2
   
   IOW, the following **versions** are all affected since they use Jackson 2.10.x:
   * 2.3.1
   * 2.2.2
   * 2.4.0
   * 2.4.1
   * 2.5.0
   * 2.5.1
   * 2.6.0
   
   and thus the following **branches** are affected since they now use Jackson 2.10.x:
   * `2.2`
   * `2.3`
   * `2.4`
   * `2.5`
   * `2.6`
   
   I've also gone through a number of versions of the https://github.com/FasterXML/jackson-databind code, looking for how MissingNode is used. It's obvious that MissingNode is used a lot more prevalently in 2.10.0. But the documentation for `MissingNode` is as follows:
   
   >  In most respects this placeholder node will act as {@link NullNode};
   > for example, for purposes of value conversions, value is considered
   > to be null and represented as value zero when used for numeric
   > conversions.
   
   So I think it's pretty clear that we should treat `JsonNodeType.MISSING` as null. This is what this PR tries to do, and I'm now less concerned about unexpected behavioral changes after having read and verified this analysis.
   
   Once this PR is merged to `trunk`, this fix will need to be backported to the following **branches** that are all affected because they use Jackson 2.10.x:
   * `2.2`
   * `2.3`
   * `2.4`
   * `2.5`
   * `2.6`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] shaikzakiriitm commented on pull request #9306: KAFKA-10477: Enabling the same behavior of NULL JsonNodeType to MISSI…

Posted by GitBox <gi...@apache.org>.
shaikzakiriitm commented on pull request #9306:
URL: https://github.com/apache/kafka/pull/9306#issuecomment-696107218


   Failures are in ConnectionQuotasTest, and ConsumerBounceTest test classes, and appear unrelated to the change in this PR. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] shaikzakiriitm commented on pull request #9306: KAFKA-10477: Enabling the same behavior of NULL JsonNodeType to MISSI…

Posted by GitBox <gi...@apache.org>.
shaikzakiriitm commented on pull request #9306:
URL: https://github.com/apache/kafka/pull/9306#issuecomment-695020619


   @kkonstantine @rhauch Kindly take a look at the changes. Thanks!


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] rhauch commented on pull request #9306: KAFKA-10477: Enabling the same behavior of NULL JsonNodeType to MISSI…

Posted by GitBox <gi...@apache.org>.
rhauch commented on pull request #9306:
URL: https://github.com/apache/kafka/pull/9306#issuecomment-702745660


   No green jobs, but all of the Connect and MirrorMaker2 tests in all of the jobs passed. All of the failures in the tests were (flaky) client, broker, or streams tests.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] shaikzakiriitm edited a comment on pull request #9306: KAFKA-10477: Enabling the same behavior of NULL JsonNodeType to MISSI…

Posted by GitBox <gi...@apache.org>.
shaikzakiriitm edited a comment on pull request #9306:
URL: https://github.com/apache/kafka/pull/9306#issuecomment-702221113


   Following are the version dependencies of Apache Kafka (ak) on com.fasterxml.jackson.core:jackson-databind (j)
   
   - ak v2.1.0 -> j v2.9.7
   - ak v2.1.1 -> j v2.9.8
   - ak v2.2.0 -> j v2.9.8
   - ak v2.2.1 -> j v2.9.8
   - ak v2.2.2 -> j v2.10.0
   - ak v2.3.0 -> j v2.9.9 [**Looks like an outlier, where ak version increased with a decrease in jackson-databind version**]
   - ak v2.3.1 -> j v2.10.0
   - ak v2.4.0 -> j v2.10.0
   - ak v2.4.1 -> j v2.10.0
   - ak v2.5.1 -> j v2.10.2
   - ak v2.6 -> j v2.10.2
   
   The behavior of the Jackson  ObjectMapper#readTree() method when called with a blank string/empty-input changed as follows (based on the information from FasterXML/jackson-databind#2211) :
   - For Jackson v2.x-2.8.x, return NullNode
   - For Jackson v2.9.x return null
   - For Jackson v2.10.0  and beyond return MissingNode
   
   However, in all AK versions JsonConverter throws an error when it gets a MissingNode. This is only a problem for AK versions (v2.2.2, v2.3.1, v2.4.0, v2.4.1, v2.5.1, v2.6) that were released with Jackson 2.10.0 or later:
   
   - All versions of AK 2.1.x-2.2.1, ObjectMapper.readTree() returns null on empty input, which AK populates in an envelope, and the payload in the envelope which is null is treated as json node of JsonNodeType NULL. And this NULL case is handled in `JsonConverter#convertToConnect()`.
   - AK version 2.2.2, ObjectMapper.readTree() returns MissingNode on empty input, which AK populates as payload in envelope; this payload is a jsonNode of JsonNodeType MISSING. And `JsonConverter#convertToConnect()` considers `schemaType` to be null for MISSING case, which results in `DataException`.
   - AK 2.3.0, (uses jackson-databind version 2.9.9), ObjectMapper.readTree() returns null on empty input, which AK populates as payload in envelope; this payload when retrieved in to be used in JsonConverter.convertToConnect(), is treated as NullNode, of JsonNodeType NULL. (as in the case of AK 2.1.x-2.2.1).
   - AK 2.3.1-2.6, ObjectMapper.readTree() returns MissingNode on empty input, which AK populates as payload in envelope; this payload is a jsonNode of type MISSING which causes the DataException.
   
   So, this PR aims to resolve the DataException that gets thrown when jackson-databind's `ObjectMapper.readTree()` returns MissingNode upon encountering an empty input, so that JsonConverter treats MissingNode in a similar behavior as that of null. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] rhauch merged pull request #9306: KAFKA-10477: Enabling the same behavior of NULL JsonNodeType to MISSI…

Posted by GitBox <gi...@apache.org>.
rhauch merged pull request #9306:
URL: https://github.com/apache/kafka/pull/9306


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org