Posted to commits@spark.apache.org by yl...@apache.org on 2016/10/22 08:59:47 UTC
spark git commit: [SPARK-17986][ML] SQLTransformer should remove temporary tables
Repository: spark
Updated Branches:
refs/heads/master 01b26a064 -> ab3363e9f
[SPARK-17986][ML] SQLTransformer should remove temporary tables
## What changes were proposed in this pull request?
Previously, each call to `SQLTransformer.transform` created a temporary table and never deleted it. This change adds a call to `dropTempView()` that deletes the temporary table before the result is returned, so the table does not remain in Spark's table catalog. Because `tableName` is randomized and not exposed, nothing outside the `transform` method is expected to use this table.
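The create, query, drop sequence described above can be sketched outside of `SQLTransformer` as follows. This is a minimal illustration and not the code in this patch: the helper `queryViaTempView`, the view name `sketch_view`, and the sample data are hypothetical, and the `try`/`finally` is an addition of this sketch (the patch itself simply drops the view after calling `sql`). It assumes a local Spark 2.x session with `spark-sql` on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object TempViewCleanupSketch {

  // Hypothetical helper (not part of Spark): registers `df` under `viewName`,
  // runs `sqlText` against it, and drops the view before returning.
  def queryViaTempView(df: DataFrame, viewName: String, sqlText: String): DataFrame = {
    val spark = df.sparkSession
    df.createOrReplaceTempView(viewName)
    try {
      // sql() analyzes the statement here, while the view is still registered.
      spark.sql(sqlText)
    } finally {
      // Remove the view so it does not linger in the session catalog.
      spark.catalog.dropTempView(viewName)
    }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("temp-view-cleanup-sketch")
      .getOrCreate()
    import spark.implicits._

    // Illustrative data only.
    val df = Seq((0, 1.0, 3.0), (2, 2.0, 5.0)).toDF("id", "v1", "v2")
    val out = queryViaTempView(df, "sketch_view", "SELECT *, (v1 + v2) AS v3 FROM sketch_view")

    // The result is used after the view has already been dropped.
    out.show()
    spark.stop()
  }
}
```

The sketch relies on the statement being analyzed when `sql` is called, which is why the returned DataFrame can still be used once the view name is gone.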
## How was this patch tested?
A single new assertion was added to the existing test of the `SQLTransformer.transform` method, checking that no temporary tables remain after the call. Without the corresponding code change, the new assertion fails. I am not aware of any circumstances in which removing this temporary view would harm performance or correctness, but input from someone with more expertise here would be helpful.
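For a session that may already contain other tables or views, a narrower variant of this check is to compare the catalog's table names before and after the call. The following is a hedged sketch rather than the assertion added in this patch: the object name, sample data, and statement are illustrative, and it assumes `spark-mllib` on the classpath and a build that includes this change.

```scala
import org.apache.spark.ml.feature.SQLTransformer
import org.apache.spark.sql.SparkSession

object CatalogLeakCheckSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("catalog-leak-check-sketch")
      .getOrCreate()
    import spark.implicits._

    // Illustrative input only.
    val original = Seq((0, 1.0, 3.0), (2, 2.0, 5.0)).toDF("id", "v1", "v2")
    val sqlTrans = new SQLTransformer()
      .setStatement("SELECT *, (v1 + v2) AS v3 FROM __THIS__")

    // Snapshot the catalog before and after transform().
    val before = spark.catalog.listTables().collect().map(_.name).toSet
    val result = sqlTrans.transform(original)
    val after = spark.catalog.listTables().collect().map(_.name).toSet

    // With the temporary view dropped inside transform(), nothing new should remain.
    assert(after == before, s"leaked tables: ${after -- before}")

    result.show()
    spark.stop()
  }
}
```

Comparing snapshots avoids assuming that the catalog is empty to begin with, which the `count() == 0` form of the check requires.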
Author: Drew Robb <dr...@gmail.com>
Closes #15526 from drewrobb/SPARK-17986.
Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ab3363e9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ab3363e9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ab3363e9
Branch: refs/heads/master
Commit: ab3363e9f6b1f7fc26682509fe7382c570f91778
Parents: 01b26a0
Author: Drew Robb <dr...@gmail.com>
Authored: Sat Oct 22 01:59:36 2016 -0700
Committer: Yanbo Liang <yb...@gmail.com>
Committed: Sat Oct 22 01:59:36 2016 -0700
----------------------------------------------------------------------
.../main/scala/org/apache/spark/ml/feature/SQLTransformer.scala | 4 +++-
.../scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala | 1 +
2 files changed, 4 insertions(+), 1 deletion(-)
----------------------------------------------------------------------
http://git-wip-us.apache.org/repos/asf/spark/blob/ab3363e9/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
----------------------------------------------------------------------
diff --git a/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala b/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
index 259be26..b25fff9 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/SQLTransformer.scala
@@ -67,7 +67,9 @@ class SQLTransformer @Since("1.6.0") (@Since("1.6.0") override val uid: String)
val tableName = Identifiable.randomUID(uid)
dataset.createOrReplaceTempView(tableName)
val realStatement = $(statement).replace(tableIdentifier, tableName)
- dataset.sparkSession.sql(realStatement)
+ val result = dataset.sparkSession.sql(realStatement)
+ dataset.sparkSession.catalog.dropTempView(tableName)
+ result
}
@Since("1.6.0")
http://git-wip-us.apache.org/repos/asf/spark/blob/ab3363e9/mllib/src/test/scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala
----------------------------------------------------------------------
diff --git a/mllib/src/test/scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala b/mllib/src/test/scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala
index 2346407..753f890 100644
--- a/mllib/src/test/scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/ml/feature/SQLTransformerSuite.scala
@@ -43,6 +43,7 @@ class SQLTransformerSuite
assert(result.schema.toString == resultSchema.toString)
assert(resultSchema == expected.schema)
assert(result.collect().toSeq == expected.collect().toSeq)
+ assert(original.sparkSession.catalog.listTables().count() == 0)
}
test("read/write") {