Posted to commits@spark.apache.org by sa...@apache.org on 2021/08/18 04:34:57 UTC
[spark] branch branch-3.1 updated: [SPARK-36400][SPARK-36398][SQL][WEBUI] Make ThriftServer recognize spark.sql.redaction.string.regex
This is an automated email from the ASF dual-hosted git repository.
sarutak pushed a commit to branch branch-3.1
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/branch-3.1 by this push:
new 31d771d [SPARK-36400][SPARK-36398][SQL][WEBUI] Make ThriftServer recognize spark.sql.redaction.string.regex
31d771d is described below
commit 31d771dcf242cfa477b04f28950526bf87b7e90a
Author: Kousuke Saruta <sa...@oss.nttdata.com>
AuthorDate: Wed Aug 18 13:31:22 2021 +0900
[SPARK-36400][SPARK-36398][SQL][WEBUI] Make ThriftServer recognize spark.sql.redaction.string.regex
### What changes were proposed in this pull request?
This PR fixes an issue where ThriftServer does not recognize `spark.sql.redaction.string.regex`.
Without the fix, sensitive information included in queries can be exposed in the UI.
![thrift-password1](https://user-images.githubusercontent.com/4736016/129440772-46379cc5-987b-41ac-adce-aaf2139f6955.png)
![thrift-password2](https://user-images.githubusercontent.com/4736016/129440775-fd328c0f-d128-4a20-82b0-46c331b9fd64.png)
### Why are the changes needed?
Bug fix.
### Does this PR introduce _any_ user-facing change?
No.
### How was this patch tested?
Ran ThriftServer, connected to it, and executed `CREATE TABLE mytbl2(a int) OPTIONS(url="jdbc:mysql//example.com:3306", driver="com.mysql.jdbc.Driver", dbtable="test_tbl", user="test_usr", password="abcde");` with `spark.sql.redaction.string.regex=((?i)(?<=password=))(".*")|('.*')`.
Then confirmed in the UI that the password is redacted.
![thrift-hide-password1](https://user-images.githubusercontent.com/4736016/129440863-cabea247-d51f-41a4-80ac-6c64141e1fb7.png)
![thrift-hide-password2](https://user-images.githubusercontent.com/4736016/129440874-96cd0f0c-720b-4010-968a-cffbc85d2be5.png)
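For readers who want to see the effect of the test regex outside of Spark, here is a minimal standalone sketch. It is not the ThriftServer code: `RedactionSketch`, its `redact` helper, and the `ReplacementText` placeholder are hypothetical names chosen for illustration, and the helper only assumes that redaction means replacing every regex match with a fixed placeholder when a pattern is configured.

```scala
import scala.util.matching.Regex

object RedactionSketch {
  // Placeholder that stands in for redacted text (an assumption of this sketch).
  val ReplacementText = "*********(redacted)"

  // Hypothetical helper: replace every match of the optional pattern with the
  // placeholder; return the text unchanged when no pattern is configured.
  def redact(pattern: Option[Regex], text: String): String =
    pattern match {
      case Some(r) if text.nonEmpty =>
        // quoteReplacement keeps '$' and '\' in the placeholder from being
        // interpreted as group references.
        r.replaceAllIn(text, Regex.quoteReplacement(ReplacementText))
      case _ => text
    }

  def main(args: Array[String]): Unit = {
    // The regex and statement used in the manual test above.
    val pattern = Some("""((?i)(?<=password=))(".*")|('.*')""".r)
    val statement =
      """CREATE TABLE mytbl2(a int) OPTIONS(url="jdbc:mysql//example.com:3306", driver="com.mysql.jdbc.Driver", dbtable="test_tbl", user="test_usr", password="abcde");"""

    // The quoted value after password= is replaced by the placeholder.
    println(redact(pattern, statement))
  }
}
```

Note that `.*` is greedy, so the first alternative matches from the quote after `password=` up to the last double quote on the line. In the test statement `password` is the final option, so only `"abcde"` is replaced. If other quoted options followed it, they would be redacted as well.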
Closes #33743 from sarutak/thrift-redact.
Authored-by: Kousuke Saruta <sa...@oss.nttdata.com>
Signed-off-by: Kousuke Saruta <sa...@oss.nttdata.com>
(cherry picked from commit b914ff7d54bd7c07e7313bb06a1fa22c36b628d2)
Signed-off-by: Kousuke Saruta <sa...@oss.nttdata.com>
---
.../spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala
index f7a4be9..acb00e4 100644
--- a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala
+++ b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkExecuteStatementOperation.scala
@@ -220,10 +220,11 @@ private[hive] class SparkExecuteStatementOperation(
override def runInternal(): Unit = {
setState(OperationState.PENDING)
logInfo(s"Submitting query '$statement' with $statementId")
+ val redactedStatement = SparkUtils.redact(sqlContext.conf.stringRedactionPattern, statement)
HiveThriftServer2.eventManager.onStatementStart(
statementId,
parentSession.getSessionHandle.getSessionId.toString,
- statement,
+ redactedStatement,
statementId,
parentSession.getUsername)
setHasResultSet(true) // avoid no resultset for async run