Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/04/02 06:14:36 UTC

[GitHub] [spark] MaxGekk opened a new pull request #32035: [SPARK-34938][SQL] Benchmark only legacy interval in `ExtractBenchmark`

MaxGekk opened a new pull request #32035:
URL: https://github.com/apache/spark/pull/32035


   ### What changes were proposed in this pull request?
   In this PR, I propose to disable ANSI intervals as the result of date/timestamp subtraction in `ExtractBenchmark` and to benchmark only legacy intervals, because `EXTRACT(.. FROM ..)` doesn't support ANSI intervals so far.
   
   ### Why are the changes needed?
   This fixes the benchmark failure:
   ```
   [info]   Running case: YEAR of interval
   [error] Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve 'year((subtractdates(CAST(timestamp_seconds(id) AS DATE), DATE '0001-01-01') + subtracttimestamps(timestamp_seconds(id), TIMESTAMP '1000-01-01 01:02:03.123456')))' due to data type mismatch: argument 1 requires date type, however, '(subtractdates(CAST(timestamp_seconds(id) AS DATE), DATE '0001-01-01') + subtracttimestamps(timestamp_seconds(id), TIMESTAMP '1000-01-01 01:02:03.123456'))' is of day-time interval type.; line 1 pos 0;
   [error] 'Project [extract(YEAR, (subtractdates(cast(timestamp_seconds(id#1456L) as date), 0001-01-01, false) + subtracttimestamps(timestamp_seconds(id#1456L), 1000-01-01 01:02:03.123456, false, Some(Europe/Moscow)))) AS YEAR#1458]
   [error] +- Range (1262304000, 1272304000, step=1, splits=Some(1))
   [error] 	at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
   [error] 	at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$$nestedInanonfun$checkAnalysis$1$2.applyOrElse(CheckAnalysis.scala:194)
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   By running the `ExtractBenchmark` benchmark via:
   ```
   $ build/sbt "sql/test:runMain org.apache.spark.sql.execution.benchmark.ExtractBenchmark"
   ```
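
   For illustration only, a minimal sketch of the same idea outside the benchmark harness. It assumes a Spark build with the legacy interval config on the classpath, and it assumes that `SQLConf.LEGACY_INTERVAL_ENABLED.key` resolves to the string `spark.sql.legacy.interval.enabled`; the object name and the query are hypothetical and not part of this PR:
   ```scala
   import org.apache.spark.sql.SparkSession

   // Hypothetical standalone sketch, not the benchmark code itself.
   object LegacyIntervalSketch {
     def main(args: Array[String]): Unit = {
       val spark = SparkSession.builder()
         .master("local[1]")
         .appName("legacy-interval-sketch")
         .getOrCreate()
       try {
         // Assumed config key: makes date/timestamp subtraction return the
         // legacy interval type instead of an ANSI interval.
         spark.conf.set("spark.sql.legacy.interval.enabled", "true")
         // With the legacy type, EXTRACT can be resolved against the
         // result of the subtraction.
         spark.sql(
           "SELECT extract(YEAR FROM (DATE '2021-04-02' - DATE '2010-01-01')) AS y"
         ).show()
       } finally {
         spark.stop()
       }
     }
   }
   ```
   The benchmark itself scopes the setting with `withSQLConf`, so it applies only to the benchmarked block rather than to the whole session.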


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] MaxGekk commented on pull request #32035: [SPARK-34938][SQL][TESTS] Benchmark only legacy interval in `ExtractBenchmark`

Posted by GitBox <gi...@apache.org>.
MaxGekk commented on pull request #32035:
URL: https://github.com/apache/spark/pull/32035#issuecomment-812345209


   @HyukjinKwon Could you review this PR, please?




[GitHub] [spark] HyukjinKwon commented on pull request #32035: [SPARK-34938][SQL][TESTS] Benchmark only legacy interval in `ExtractBenchmark`

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #32035:
URL: https://github.com/apache/spark/pull/32035#issuecomment-812353890


   None of the tests actually verifies these changes, except compilation and the linter, which passed.
   
   Merged to master




[GitHub] [spark] HyukjinKwon commented on a change in pull request #32035: [SPARK-34938][SQL][TESTS] Benchmark only legacy interval in `ExtractBenchmark`

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #32035:
URL: https://github.com/apache/spark/pull/32035#discussion_r606093476



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/ExtractBenchmark.scala
##########
@@ -39,7 +39,9 @@ object ExtractBenchmark extends SqlBasedBenchmark {
 
   private def doBenchmark(cardinality: Long, exprs: String*): Unit = {
     val sinceSecond = Instant.parse("2010-01-01T00:00:00Z").getEpochSecond
-    withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true") {
+    withSQLConf(
+      SQLConf.LEGACY_INTERVAL_ENABLED.key -> "true",
+      SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true") {

Review comment:
       BTW, I asked (offline) not to generate the benchmark results because I plan to regenerate everything after https://github.com/apache/spark/pull/32015






[GitHub] [spark] HyukjinKwon closed pull request #32035: [SPARK-34938][SQL][TESTS] Benchmark only legacy interval in `ExtractBenchmark`

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #32035:
URL: https://github.com/apache/spark/pull/32035


   

