Posted to reviews@spark.apache.org by "HeartSaVioR (via GitHub)" <gi...@apache.org> on 2023/07/16 08:35:53 UTC

[GitHub] [spark] HeartSaVioR commented on a diff in pull request #42012: [SPARK-44440][SS] Use thread pool to perform maintenance activity for hdfs/rocksdb state store providers

HeartSaVioR commented on code in PR #42012:
URL: https://github.com/apache/spark/pull/42012#discussion_r1264640056


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -1852,6 +1852,15 @@ object SQLConf {
       .createWithDefault(
         "org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider")
 
+  val STATE_STORE_MAINTENANCE_THREADS =
+    buildConf("spark.sql.streaming.stateStore.stateStoreMaintenanceThreads")
+      .internal()
+      .doc("Number of threads in the thread pool that perform clean up and snapshotting tasks " +
+        "for stateful streaming queries.")
+      .intConf
+      .checkValue(_ >= 0, "Must not be negative")

Review Comment:
   Makes sense. I don't see a case where people would want to disable the maintenance task at all. In fact, running the maintenance task is a must; otherwise snapshotting is never triggered, and the query takes forever to load because it has to restore the state from all deltas.
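
   To make the cost concrete, here is a minimal sketch. It is a hypothetical model, not the state store provider's actual code: it assumes one delta per committed version, and the names `SnapshotSketch`, `deltasToReplay`, and `validateMaintenanceThreads` are invented for illustration. The stricter check (`> 0` rather than `>= 0`) is likewise only an assumed alternative to the one in the diff.

```scala
// Hypothetical model, NOT Spark's implementation.
// Assumption: every committed version writes exactly one delta, and a
// snapshot at version s captures the full state as of s.
object SnapshotSketch {

  // Number of deltas that must be replayed to restore state at `version`.
  // `lastSnapshot` is None when maintenance has never produced a snapshot.
  def deltasToReplay(version: Long, lastSnapshot: Option[Long]): Long = {
    require(version >= 0, "version must not be negative")
    lastSnapshot match {
      case Some(s) =>
        require(s >= 0 && s <= version, "snapshot must be in [0, version]")
        version - s // only the deltas written after the snapshot
      case None =>
        version // no snapshot: every delta since the start
    }
  }

  // Assumed stricter variant of the check in the diff: reject 0 as well,
  // so the maintenance pool can never be configured away.
  def validateMaintenanceThreads(n: Int): Int = {
    require(n > 0, "Must be greater than 0")
    n
  }
}
```

   Under these assumptions, restoring version 1000 with no snapshot replays 1000 deltas, while a snapshot at version 990 reduces that to 10. The replay cost grows with the age of the query whenever maintenance never runs.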



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

