Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/11 07:17:43 UTC

[GitHub] [spark] LuciferYang opened a new pull request, #37480: [SPARK-40046][SQL] Use Jackson instead of json4s to serialize `RocksDBMetrics`

LuciferYang opened a new pull request, #37480:
URL: https://github.com/apache/spark/pull/37480

   ### What changes were proposed in this pull request?
   This PR switches the serialization of `RocksDBMetrics` from json4s to Jackson.
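   
   For illustration only, here is a minimal sketch of Jackson-based serialization of a Scala case class. `DemoMetrics` and `DemoMetricsJson` are hypothetical stand-ins, not the actual `RocksDBMetrics` code from this PR; the sketch assumes `jackson-databind` and `jackson-module-scala` are on the classpath.
   
   ```scala
   import com.fasterxml.jackson.databind.ObjectMapper
   import com.fasterxml.jackson.module.scala.DefaultScalaModule
   
   // Hypothetical stand-in for RocksDBMetrics; the real class has more fields.
   case class DemoMetrics(
       numCommittedKeys: Long,
       numUncommittedKeys: Long,
       lastCommitLatencyMs: Map[String, Long])
   
   object DemoMetricsJson {
     // A configured ObjectMapper is thread-safe, so a single shared instance is enough.
     private val mapper = new ObjectMapper().registerModule(DefaultScalaModule)
   
     // Serialize the case class to a compact JSON string.
     def toJson(m: DemoMetrics): String = mapper.writeValueAsString(m)
   }
   ```
   
   Calling `DemoMetricsJson.toJson(DemoMetrics(1L, 1L, Map("flush" -> 60L)))` yields a compact JSON object with one property per case class field.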
   
   
   ### Why are the changes needed?
   Reduce Spark's dependence on `json4s`.
   
   
   ### Does this PR introduce _any_ user-facing change?
   No. `RocksDBMetrics.json` is currently used only for log output.
   
   
   ### How was this patch tested?
   
   - Pass GitHub Actions
   - Manually compare json results:
   
   **Before**
   ```
   13:22:48.839 INFO org.apache.spark.sql.execution.streaming.state.RocksDB [Thread-1]: Committed 1, stats = {"numCommittedKeys":1,"numUncommittedKeys":1,"totalMemUsageBytes":3976,"writeBatchMemUsageBytes":17,"totalSSTFilesBytes":1146,"nativeOpsHistograms":{"get":{"sum":11,"avg":5.5,"stddev":0.5,"median":5.0,"p95":5.9,"p99":5.98,"count":2},"put":{"sum":17932,"avg":17932.0,"stddev":0.0,"median":17932.0,"p95":17932.0,"p99":17932.0,"count":1},"compaction":{"sum":0,"avg":0.0,"stddev":0.0,"median":0.0,"p95":0.0,"p99":0.0,"count":0}},"lastCommitLatencyMs":{"fileSync":582,"writeBatch":18,"flush":58,"pause":0,"checkpoint":64,"compact":0},"filesCopied":1,"bytesCopied":1146,"filesReused":0,"zipFileBytesUncompressed":6973,"nativeOpsMetrics":{"writerStallDuration":0,"totalBytesReadThroughIterator":0,"readBlockCacheHitCount":0,"totalBytesWrittenByCompaction":0,"readBlockCacheMissCount":0,"totalBytesReadByCompaction":0,"totalBytesWritten":17,"totalBytesRead":0}}
   ```
   **After**
   ```
   13:18:45.210 INFO org.apache.spark.sql.execution.streaming.state.RocksDB [Thread-1]: Committed 1, stats = {"numCommittedKeys":1,"numUncommittedKeys":1,"totalMemUsageBytes":3976,"writeBatchMemUsageBytes":17,"totalSSTFilesBytes":1146,"nativeOpsHistograms":{"get":{"sum":7,"avg":3.5,"stddev":0.5,"median":3.0,"p95":3.9,"p99":3.98,"count":2},"put":{"sum":17927,"avg":17927.0,"stddev":0.0,"median":17927.0,"p95":17927.0,"p99":17927.0,"count":1},"compaction":{"sum":0,"avg":0.0,"stddev":0.0,"median":0.0,"p95":0.0,"p99":0.0,"count":0}},"lastCommitLatencyMs":{"fileSync":595,"writeBatch":17,"flush":60,"pause":0,"checkpoint":64,"compact":0},"filesCopied":1,"bytesCopied":1146,"filesReused":0,"zipFileBytesUncompressed":6973,"nativeOpsMetrics":{"writerStallDuration":0,"totalBytesReadThroughIterator":0,"readBlockCacheHitCount":0,"totalBytesWrittenByCompaction":0,"readBlockCacheMissCount":0,"totalBytesReadByCompaction":0,"totalBytesWritten":17,"totalBytesRead":0}}
   ```
   
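   There is no structural difference: both outputs have the same fields in the same order, and only run-dependent timing values (for example the `get`/`put` histogram statistics and the commit latencies) differ.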
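   
   The manual comparison above could also be automated. The following is a rough sketch, not part of this PR: it parses both strings with Jackson and compares the ordered list of field paths while ignoring leaf values. `JsonShape`, `fieldPaths` and `sameShape` are hypothetical names, arrays are not descended into (the metrics JSON has none), and the `scala.jdk.CollectionConverters` import assumes Scala 2.13.
   
   ```scala
   import com.fasterxml.jackson.databind.{JsonNode, ObjectMapper}
   import scala.jdk.CollectionConverters._
   
   object JsonShape {
     private val mapper = new ObjectMapper()
   
     // Collect dotted field paths in document order, ignoring leaf values.
     def fieldPaths(node: JsonNode, prefix: String = ""): Seq[String] =
       if (node.isObject) {
         node.fields().asScala.toSeq.flatMap { e =>
           val path = if (prefix.isEmpty) e.getKey else s"$prefix.${e.getKey}"
           path +: fieldPaths(e.getValue, path)
         }
       } else {
         Seq.empty
       }
   
     // Two documents have the same shape if they expose the same paths in the same order.
     def sameShape(a: String, b: String): Boolean =
       fieldPaths(mapper.readTree(a)) == fieldPaths(mapper.readTree(b))
   }
   ```
   
   Passing the two `stats = {...}` payloads from the log lines above to `JsonShape.sameShape` would check field names and ordering without being affected by the timing values.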


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] LuciferYang closed pull request #37480: [DON'T MERGE] Try to replace all `json4s` with `Jackson`

Posted by GitBox <gi...@apache.org>.

LuciferYang closed pull request #37480: [DON'T MERGE] Try to replace all `json4s` with `Jackson`
URL: https://github.com/apache/spark/pull/37480



[GitHub] [spark] LuciferYang commented on pull request #37480: [SPARK-40046][SQL][SS] Use Jackson instead of json4s to serialize `RocksDBMetrics`

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on PR #37480:
URL: https://github.com/apache/spark/pull/37480#issuecomment-1214655019

   It seems that json4s usage can be grouped by module as follows:
   
   - **core**
   ```
   core/src/main/scala/org/apache/spark/TestUtils.scala
   core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala
   core/src/main/scala/org/apache/spark/deploy/JsonProtocol.scala
   core/src/main/scala/org/apache/spark/deploy/StandaloneResourceUtils.scala
   core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala
   core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionServer.scala
   core/src/main/scala/org/apache/spark/deploy/rest/SubmitRestProtocolMessage.scala
   core/src/main/scala/org/apache/spark/deploy/worker/ui/WorkerPage.scala
   core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala
   core/src/main/scala/org/apache/spark/resource/ResourceInformation.scala
   core/src/main/scala/org/apache/spark/resource/ResourceUtils.scala
   core/src/main/scala/org/apache/spark/ui/JettyUtils.scala
   core/src/main/scala/org/apache/spark/ui/WebUI.scala
   core/src/main/scala/org/apache/spark/util/JsonProtocol.scala
   ```
   
   - **catalyst**
   ```
   sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala
   sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala
   sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala
   sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/trees/TreeNode.scala
   sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/DataTypeJsonUtils.scala
   sql/catalyst/src/main/scala/org/apache/spark/sql/types/ArrayType.scala
   sql/catalyst/src/main/scala/org/apache/spark/sql/types/DataType.scala
   sql/catalyst/src/main/scala/org/apache/spark/sql/types/MapType.scala
   sql/catalyst/src/main/scala/org/apache/spark/sql/types/Metadata.scala
   sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructField.scala
   sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala
   sql/catalyst/src/main/scala/org/apache/spark/sql/types/UserDefinedType.scala
   ```
   
   - **sql**
   ```
   sql/core/src/main/scala/org/apache/spark/sql/execution/command/views.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceUtils.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CommitLog.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSinkLog.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSourceLog.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSourceOffset.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/HDFSMetadataLog.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/OffsetSeq.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/RateStreamOffset.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetadata.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousRateStreamSource.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/ContinuousTextSocketSource.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/ContinuousMemoryStream.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/sources/RatePerMicroBatchStream.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDBFileManager.scala
   sql/core/src/main/scala/org/apache/spark/sql/execution/ui/ExecutionPage.scala
   sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryStatus.scala
   sql/core/src/main/scala/org/apache/spark/sql/streaming/progress.scala
   ```
   
   - **mllib**
   ```
   mllib/src/main/scala/org/apache/spark/ml/Pipeline.scala
   mllib/src/main/scala/org/apache/spark/ml/classification/DecisionTreeClassifier.scala
   mllib/src/main/scala/org/apache/spark/ml/classification/GBTClassifier.scala
   mllib/src/main/scala/org/apache/spark/ml/classification/NaiveBayes.scala
   mllib/src/main/scala/org/apache/spark/ml/classification/OneVsRest.scala
   mllib/src/main/scala/org/apache/spark/ml/classification/RandomForestClassifier.scala
   mllib/src/main/scala/org/apache/spark/ml/clustering/LDA.scala
   mllib/src/main/scala/org/apache/spark/ml/fpm/FPGrowth.scala
   mllib/src/main/scala/org/apache/spark/ml/linalg/JsonMatrixConverter.scala
   mllib/src/main/scala/org/apache/spark/ml/linalg/JsonVectorConverter.scala
   mllib/src/main/scala/org/apache/spark/ml/param/params.scala
   mllib/src/main/scala/org/apache/spark/ml/r/AFTSurvivalRegressionWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/ALSWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/BisectingKMeansWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/DecisionTreeClassifierWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/DecisionTreeRegressorWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/FMClassifierWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/FMRegressorWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/FPGrowthWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/GBTClassifierWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/GBTRegressorWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/GaussianMixtureWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/GeneralizedLinearRegressionWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/IsotonicRegressionWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/KMeansWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/LDAWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/LinearRegressionWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/LinearSVCWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/LogisticRegressionWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/MultilayerPerceptronClassifierWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/NaiveBayesWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/RWrappers.scala
   mllib/src/main/scala/org/apache/spark/ml/r/RandomForestClassifierWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/r/RandomForestRegressorWrapper.scala
   mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala
   mllib/src/main/scala/org/apache/spark/ml/regression/DecisionTreeRegressor.scala
   mllib/src/main/scala/org/apache/spark/ml/regression/GBTRegressor.scala
   mllib/src/main/scala/org/apache/spark/ml/regression/RandomForestRegressor.scala
   mllib/src/main/scala/org/apache/spark/ml/tree/treeModels.scala
   mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala
   mllib/src/main/scala/org/apache/spark/ml/tuning/TrainValidationSplit.scala
   mllib/src/main/scala/org/apache/spark/ml/tuning/ValidatorParams.scala
   mllib/src/main/scala/org/apache/spark/ml/util/Instrumentation.scala
   mllib/src/main/scala/org/apache/spark/ml/util/ReadWrite.scala
   mllib/src/main/scala/org/apache/spark/mllib/classification/ClassificationModel.scala
   mllib/src/main/scala/org/apache/spark/mllib/classification/NaiveBayes.scala
   mllib/src/main/scala/org/apache/spark/mllib/classification/impl/GLMClassificationModel.scala
   mllib/src/main/scala/org/apache/spark/mllib/clustering/BisectingKMeansModel.scala
   mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixtureModel.scala
   mllib/src/main/scala/org/apache/spark/mllib/clustering/KMeansModel.scala
   mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAModel.scala
   mllib/src/main/scala/org/apache/spark/mllib/clustering/PowerIterationClustering.scala
   mllib/src/main/scala/org/apache/spark/mllib/feature/ChiSqSelector.scala
   mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala
   mllib/src/main/scala/org/apache/spark/mllib/fpm/FPGrowth.scala
   mllib/src/main/scala/org/apache/spark/mllib/fpm/PrefixSpan.scala
   mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala
   mllib/src/main/scala/org/apache/spark/mllib/recommendation/MatrixFactorizationModel.scala
   mllib/src/main/scala/org/apache/spark/mllib/regression/IsotonicRegression.scala
   mllib/src/main/scala/org/apache/spark/mllib/regression/RegressionModel.scala
   mllib/src/main/scala/org/apache/spark/mllib/regression/impl/GLMRegressionModel.scala
   mllib/src/main/scala/org/apache/spark/mllib/tree/model/DecisionTreeModel.scala
   mllib/src/main/scala/org/apache/spark/mllib/tree/model/treeEnsembleModels.scala
   mllib/src/main/scala/org/apache/spark/mllib/util/modelSaveLoad.scala
   ```
   
   - **kafka**
   
   ```
   connector/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/JsonUtils.scala
   ```
   
   This PR seems too small in scope. I'll set it to draft and try more changes.
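   
   For reference, an inventory like the one above could be reproduced with a small source scan. How the original list was produced is not stated, so the following is only a sketch under assumptions: `Json4sInventory` is a hypothetical helper that treats any `.scala` file under a `src/main/scala` path mentioning `org.json4s` as a json4s user.
   
   ```scala
   import java.nio.charset.StandardCharsets
   import java.nio.file.{Files, Path}
   import scala.jdk.CollectionConverters._
   import scala.util.Try
   
   object Json4sInventory {
     // True if the file text mentions the json4s package anywhere.
     def mentionsJson4s(text: String): Boolean = text.contains("org.json4s")
   
     // Read a file as UTF-8; unreadable files are treated as empty.
     private def read(p: Path): String =
       Try(new String(Files.readAllBytes(p), StandardCharsets.UTF_8)).getOrElse("")
   
     // List main-source Scala files under `root` that mention json4s, sorted by path.
     def scan(root: Path): List[Path] = {
       val stream = Files.walk(root)
       try {
         stream.iterator().asScala
           .filter(p => Files.isRegularFile(p) && p.toString.endsWith(".scala"))
           .filter(p => p.toString.replace('\\', '/').contains("src/main/scala"))
           .filter(p => mentionsJson4s(read(p)))
           .toList // force evaluation before the stream is closed
           .sortBy(_.toString)
       } finally {
         stream.close()
       }
     }
   }
   ```
   
   Running `Json4sInventory.scan(Paths.get("."))` (with `java.nio.file.Paths` imported) from a Spark checkout would print candidates to review; a plain substring match can over-report, for example when json4s is mentioned only in a comment.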
   
   
   
   



[GitHub] [spark] JoshRosen commented on pull request #37480: [WIP][SPARK-40046][CORE][SQL][SS] Use Jackson instead of json4s to serialize `RocksDBMetrics`

Posted by GitBox <gi...@apache.org>.

JoshRosen commented on PR #37480:
URL: https://github.com/apache/spark/pull/37480#issuecomment-1215496802

   > In [SPARK-39489](https://github.com/apache/spark/pull/36885), for `Why are the changes needed?`, I found some of the reasons are as follows:
   > 
   > ```
   > In addition, this is a stepping-stone towards eventually being able to remove our Json4s dependency:
   > 
   > Today Spark uses Json4s 3.x and this causes library conflicts for end users who want to upgrade to 4.x; see https://github.com/apache/spark/pull/33630 for one example.
   > To completely remove Json4s we'll need to update several other parts of Spark (including code used for ML model serialization); this PR is just a first step towards that goal if we decide to pursue it.
   > In this PR, I continue to use Json4s in test code; I think it's fine to keep Json4s as a test-only dependency.
   > ```
   > 
   > I'm not sure if @JoshRosen has plans for the next step and the overall blueprint for this. I'm just learning this [SPARK-39489](https://github.com/apache/spark/pull/36885) and trying to start with some simple cases. Similarly, I submitted another pr: #37515
   
   @LuciferYang, my change in https://github.com/apache/spark/pull/36885 was primarily motivated by History Server performance. Although unblocking the removal of Json4s is a nice secondary benefit, I don't think that removal is a super high priority. There are also some burdens in terms of testing to ensure that the old and new JSON are fully cross-compatible (in cases where they need to be). Aside from performance-sensitive places, there's limited benefit from _partial_ removal of Json4s: I think the big user-facing benefits would be achieved only when users are free to use any version of Json4s because Spark drops its dependency. Therefore, I think we should do some more analysis to confirm that it's technically possible to fully remove Json4s (and that doing so is safe / desirable) before we start merging these smaller piece-by-piece removals.



[GitHub] [spark] LuciferYang commented on pull request #37480: [SPARK-40046][SQL] Use Jackson instead of json4s to serialize `RocksDBMetrics`

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on PR #37480:
URL: https://github.com/apache/spark/pull/37480#issuecomment-1211638145

   Although this is a minor change, I ran a simple benchmark to make sure it has no negative performance impact:
   
   **Bench Code**
   
   ```
   import com.fasterxml.jackson.databind.ObjectMapper
   import com.fasterxml.jackson.module.scala.DefaultScalaModule
   
   import org.apache.spark.benchmark.Benchmark
   import org.apache.spark.sql.execution.streaming.state.RocksDBMetrics
   
   // `output` is assumed to come from the enclosing benchmark object
   // (for example a BenchmarkBase subclass).
   val valuesPerIteration = 100000
   
   val jsonString =
     """{"numCommittedKeys":1,"numUncommittedKeys":1,"totalMemUsageBytes":3976,
       |"writeBatchMemUsageBytes":17,"totalSSTFilesBytes":1146,
       |"nativeOpsHistograms":{"get":{"sum":7,"avg":3.5,"stddev":0.5,
       |"median":3.0,"p95":3.9,"p99":3.98,"count":2},"put":{"sum":17927,
       |"avg":17927.0,"stddev":0.0,"median":17927.0,"p95":17927.0,
       |"p99":17927.0,"count":1},"compaction":{"sum":0,"avg":0.0,
       |"stddev":0.0,"median":0.0,"p95":0.0,"p99":0.0,"count":0}},
       |"lastCommitLatencyMs":{"fileSync":595,"writeBatch":17,
       |"flush":60,"pause":0,"checkpoint":64,"compact":0},"filesCopied":1,
       |"bytesCopied":1146,"filesReused":0,"zipFileBytesUncompressed":6973,
       |"nativeOpsMetrics":{"writerStallDuration":0,"totalBytesReadThroughIterator":0,
       |"readBlockCacheHitCount":0,"totalBytesWrittenByCompaction":0,
       |"readBlockCacheMissCount":0,"totalBytesReadByCompaction":0,
       |"totalBytesWritten":17,"totalBytesRead":0}}""".stripMargin
   
   // Build a RocksDBMetrics instance by deserializing the sample JSON.
   val metrics: RocksDBMetrics = new ObjectMapper().registerModule(DefaultScalaModule)
     .readValue(jsonString, classOf[RocksDBMetrics])
   
   val benchmark = new Benchmark("Test RocksDBMetrics to Json",
     valuesPerIteration, output = output)
   
   // Baseline: RocksDBMetrics.json, which is json4s-based before this change.
   benchmark.addCase("Use Json4s") { _: Int =>
     for (_ <- 0L until valuesPerIteration) {
       metrics.json
     }
   }
   
   // Candidate: a shared Jackson ObjectMapper with the Scala module registered.
   val mapper = new ObjectMapper().registerModule(DefaultScalaModule)
   benchmark.addCase("Use Jackson") { _: Int =>
     for (_ <- 0L until valuesPerIteration) {
       mapper.writeValueAsString(metrics)
     }
   }
   
   benchmark.run()
   ```
   
   **Java8**
   ```
   OpenJDK 64-Bit Server VM 1.8.0_345-b01 on Linux 5.15.0-1014-azure
   Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
   Test RocksDBMetrics to Json:              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   Use Json4s                                         4078           4091          17          0.0       40785.0       1.0X
   Use Jackson                                         384            386           1          0.3        3837.3      10.6X
   ```
   **Java11**
   ```
   OpenJDK 64-Bit Server VM 11.0.16+8-LTS on Linux 5.15.0-1014-azure
   Intel(R) Xeon(R) Platinum 8370C CPU @ 2.80GHz
   Test RocksDBMetrics to Json:              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   Use Json4s                                         3361           3362           2          0.0       33606.9       1.0X
   Use Jackson                                         361            362           1          0.3        3610.3       9.3X
   ```
   **Java17**
   ```
   OpenJDK 64-Bit Server VM 17.0.4+8-LTS on Linux 5.15.0-1014-azure
   Intel(R) Xeon(R) Platinum 8272CL CPU @ 2.60GHz
   Test RocksDBMetrics to Json:              Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   Use Json4s                                         3787           3792           7          0.0       37870.2       1.0X
   Use Jackson                                         362            367           5          0.3        3618.8      10.5X
   ```
   Jackson is roughly 10x faster here, though performance is not the key point of this change.
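   
   As a sanity check on the `Relative` column, the speedup can be recomputed from the `Per Row(ns)` values in the tables above as json4s time divided by Jackson time. This is a small sketch; `Speedup` is a hypothetical helper and not part of Spark's benchmark utilities.
   
   ```scala
   object Speedup {
     // Per-row times in ns copied from the tables above: (JDK, json4s, Jackson).
     val perRowNs: Seq[(String, Double, Double)] = Seq(
       ("Java8", 40785.0, 3837.3),
       ("Java11", 33606.9, 3610.3),
       ("Java17", 37870.2, 3618.8))
   
     // Relative speedup = json4s time / Jackson time, rounded to one decimal place.
     def relative(json4sNs: Double, jacksonNs: Double): Double =
       math.round(json4sNs / jacksonNs * 10) / 10.0
   
     // (JDK, speedup) pairs for all three runs.
     val summary: Seq[(String, Double)] =
       perRowNs.map { case (jdk, j4s, jackson) => (jdk, relative(j4s, jackson)) }
   }
   ```
   
   `Speedup.summary` evaluates to 10.6, 9.3 and 10.5 for Java 8, 11 and 17 respectively, matching the reported `Relative` values.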
   
   



[GitHub] [spark] LuciferYang closed pull request #37480: [WIP][SPARK-40046][CORE][SQL][SS] Use Jackson instead of json4s to serialize `RocksDBMetrics`

Posted by GitBox <gi...@apache.org>.

LuciferYang closed pull request #37480: [WIP][SPARK-40046][CORE][SQL][SS] Use Jackson instead of json4s to serialize `RocksDBMetrics`
URL: https://github.com/apache/spark/pull/37480



[GitHub] [spark] LuciferYang commented on pull request #37480: [SPARK-40046][SQL][SS] Use Jackson instead of json4s to serialize `RocksDBMetrics`

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on PR #37480:
URL: https://github.com/apache/spark/pull/37480#issuecomment-1214553834

   In [SPARK-39489](https://github.com/apache/spark/pull/36885), under `Why are the changes needed?`, some of the reasons given are as follows:
   
   ```
   In addition, this is a stepping-stone towards eventually being able to remove our Json4s dependency:
   
   Today Spark uses Json4s 3.x and this causes library conflicts for end users who want to upgrade to 4.x; see https://github.com/apache/spark/pull/33630 for one example.
   To completely remove Json4s we'll need to update several other parts of Spark (including code used for ML model serialization); this PR is just a first step towards that goal if we decide to pursue it.
   In this PR, I continue to use Json4s in test code; I think it's fine to keep Json4s as a test-only dependency.
   ```
   
   I'm not sure whether @JoshRosen has plans for the next step or an overall blueprint for this. I'm still studying [SPARK-39489](https://github.com/apache/spark/pull/36885) and trying to start with some simple cases. Similarly, I submitted another PR: https://github.com/apache/spark/pull/37515



[GitHub] [spark] LuciferYang commented on pull request #37480: [WIP][SPARK-40046][CORE][SQL][SS] Use Jackson instead of json4s to serialize `RocksDBMetrics`

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on PR #37480:
URL: https://github.com/apache/spark/pull/37480#issuecomment-1215537341

   @JoshRosen Thanks for your reply. I will close this PR for now and do some more research on my own.

