You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@spark.apache.org by ka...@apache.org on 2023/02/24 22:54:46 UTC
[spark] branch master updated: [SPARK-42565][SS] Error log improvement for the lock acquisition of RocksDB state store instance
This is an automated email from the ASF dual-hosted git repository.
kabhwan pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
new ac30e9312a8 [SPARK-42565][SS] Error log improvement for the lock acquisition of RocksDB state store instance
ac30e9312a8 is described below
commit ac30e9312a89ca16c16ef27c9276b2bb21cd7431
Author: Huanli Wang <hu...@databricks.com>
AuthorDate: Sat Feb 25 07:54:18 2023 +0900
[SPARK-42565][SS] Error log improvement for the lock acquisition of RocksDB state store instance
```
"23/02/23 23:57:44 INFO Executor: Running task 2.0 in stage 57.1 (TID 363)
"23/02/23 23:58:44 ERROR RocksDB StateStoreId(opId=0,partId=3,name=default): RocksDB instance
could not be acquired by [ThreadId: Some(49), task: 3.0 in stage 57, TID 363] as it was not released by
[ThreadId: Some(51), task: 3.1 in stage 57, TID 342] after 60002 ms.
```
We are seeing those error messages for a testing query. The `taskId != partitionId` but we fail to be clear on this in the error log.
It's confusing when we see those logs: the second log entry seems to talk about `task 3.0` (it's actually partition 3 and retry attempt 0), but the `TID 363` is already occupied by `task 2.0 in stage 57.1`.
Also, it's unclear at which stage retry attempt, the lock is acquired (or fails to be acquired)
### What changes were proposed in this pull request?
* add `partition ` after `task: ` in the log message for clarification
* add stage attempt to distinguish different stage retries.
### Why are the changes needed?
improve the log message for a better debuggability
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
only log message change
Closes #40161 from huanliwang-db/rocksdb.
Authored-by: Huanli Wang <hu...@databricks.com>
Signed-off-by: Jungtaek Lim <ka...@gmail.com>
---
.../scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala
index cab2fe9b90d..32caf8a1bc8 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/RocksDB.scala
@@ -710,7 +710,8 @@ case class AcquiredThreadInfo() {
override def toString(): String = {
val taskStr = if (tc != null) {
val taskDetails =
- s"${tc.partitionId}.${tc.attemptNumber} in stage ${tc.stageId}, TID ${tc.taskAttemptId}"
+ s"partition ${tc.partitionId}.${tc.attemptNumber} in stage " +
+ s"${tc.stageId}.${tc.stageAttemptNumber()}, TID ${tc.taskAttemptId}"
s", task: $taskDetails"
} else ""
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@spark.apache.org
For additional commands, e-mail: commits-help@spark.apache.org