Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/22 03:45:03 UTC

[GitHub] [spark] LuciferYang commented on pull request #37604: [DON'T MERGE] Try to replace all `json4s` with `Jackson`

LuciferYang commented on PR #37604:
URL: https://github.com/apache/spark/pull/37604#issuecomment-1221765844

   @JoshRosen I am not sure whether this draft is helpful for future work, but I hope it is useful to some extent.
   
   This draft PR does not focus on forward compatibility: I use Jackson's `JsonNode` in place of json4s' `JValue` throughout, including method parameter types, return types, and the objects used to serialize and deserialize JSON. The test code still uses json4s to check compatibility, and all tests should pass.
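   
   To illustrate the shape of the change, here is a minimal before/after sketch (the method name is illustrative, not a specific Spark API):
   
   ```scala
   import org.json4s.{JObject, JString, JValue}
   import com.fasterxml.jackson.databind.JsonNode
   import com.fasterxml.jackson.databind.node.JsonNodeFactory
   
   // Before: the json4s type appears in the return type.
   def toJsonBefore(name: String): JValue = JObject("name" -> JString(name))
   
   // After: the same tree built with Jackson's node API.
   def toJsonAfter(name: String): JsonNode = {
     val node = JsonNodeFactory.instance.objectNode()
     node.put("name", name)
     node
   }
   ```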
   
   The change involves five modules: `core`, `catalyst`, `sql`, `mllib`, and `kafka`. Apart from `sql` directly using `Row.jsonValue` from `catalyst`, the modules are relatively independent.
   
   For the exported HTTP APIs, `JsonNode` is also used, and the `jsonResponderToServlet` method in `JettyUtils` is adapted accordingly. Of course, we could also return a custom `JsonResult` object instead of relying on `JsonNode`.
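   
   A rough sketch of what serving a `JsonNode` could look like on the Jetty side (the servlet class below is hypothetical and heavily simplified; it is not the actual `JettyUtils` code):
   
   ```scala
   import javax.servlet.http.{HttpServlet, HttpServletRequest, HttpServletResponse}
   import com.fasterxml.jackson.databind.{JsonNode, ObjectMapper}
   
   // The responder now returns a Jackson JsonNode instead of a json4s JValue,
   // and the tree is written out with ObjectMapper rather than compact(render(...)).
   class JsonNodeServlet(responder: HttpServletRequest => JsonNode) extends HttpServlet {
     private val mapper = new ObjectMapper()
   
     override def doGet(request: HttpServletRequest, response: HttpServletResponse): Unit = {
       response.setContentType("application/json;charset=utf-8")
       response.setStatus(HttpServletResponse.SC_OK)
       response.getWriter.println(mapper.writeValueAsString(responder(request)))
     }
   }
   ```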
   
   One problem found during the rewrite is that json4s has `JNothing` but Jackson has no direct equivalent. I used Jackson's `MissingNode` instead, which requires special handling, for example:
   
   ```scala
   // MissingNode plays the role of json4s' JNothing here: only attach the
   // child when it actually carries a value.
   if (!jsonNode.isMissingNode) {
     node.set[JsonNode](name, jsonNode)
   }
   ```
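   
   For comparison, a small self-contained sketch of this pattern (the helper name is made up for illustration): json4s silently drops `JNothing` fields when rendering, while with Jackson the missing values have to be skipped explicitly.
   
   ```scala
   import com.fasterxml.jackson.databind.JsonNode
   import com.fasterxml.jackson.databind.node.{JsonNodeFactory, MissingNode, ObjectNode}
   
   // Skip MissingNode values so they never become fields of the object.
   def setIfPresent(node: ObjectNode, name: String, value: JsonNode): ObjectNode = {
     if (!value.isMissingNode) {
       node.set[JsonNode](name, value)
     }
     node
   }
   
   val factory = JsonNodeFactory.instance
   val obj = factory.objectNode()
   setIfPresent(obj, "present", factory.textNode("x"))
   setIfPresent(obj, "absent", MissingNode.getInstance())
   // obj.toString == """{"present":"x"}"""
   ```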
   
   MiMa flags the following incompatibilities (listed here as the exclusion rules they would require):
   
   ```
    ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.deploy.DeployMessages#RequestExecutors.apply"),
    ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.sql.types.DataType#JSortedObject.unapplySeq"),
    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.expressions.MutableAggregationBuffer.jsonValue"),
    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.SafeJsonSerializer.safeMapToJValue"),
    ProblemFilters.exclude[DirectMissingMethodProblem]("org.apache.spark.sql.streaming.SafeJsonSerializer.safeDoubleToJValue"),
    ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.ml.param.FloatParam.jValueDecode"),
    ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.ml.param.FloatParam.jValueEncode"),
    ProblemFilters.exclude[IncompatibleMethTypeProblem]("org.apache.spark.mllib.tree.model.TreeEnsembleModel#SaveLoadV1_0.readMetadata")
   ```
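   
   These are binary signature changes: a parameter or result type moving from `JValue` to `JsonNode` is enough to trigger `IncompatibleMethTypeProblem` / `IncompatibleResultTypeProblem`. A rough illustration (the simplified declarations below are assumptions, not the exact Spark signatures):
   
   ```scala
   import org.json4s.JValue
   import com.fasterxml.jackson.databind.JsonNode
   
   // Old shape: the json4s type is part of the binary API.
   trait FloatParamBefore {
     def jValueEncode(value: Float): JValue
   }
   
   // New shape: same method name, different result type, hence the MiMa report.
   trait FloatParamAfter {
     def jValueEncode(value: Float): JsonNode
   }
   ```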
   
   This is similar to the problem described in SPARK-39658: `JValue` is used directly in public APIs.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

