Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/07/09 13:46:10 UTC

[jira] [Assigned] (SPARK-16461) Support partition batch pruning with `<=>` (EqualNullSafe) predicate in InMemoryTableScanExec

     [ https://issues.apache.org/jira/browse/SPARK-16461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-16461:
------------------------------------

    Assignee: Apache Spark

> Support partition batch pruning with `<=>` (EqualNullSafe) predicate in InMemoryTableScanExec
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-16461
>                 URL: https://issues.apache.org/jira/browse/SPARK-16461
>             Project: Spark
>          Issue Type: Improvement
>            Reporter: Hyukjin Kwon
>            Assignee: Apache Spark
>
> It seems the `EqualNullSafe` filter was missed when pruning partition batches in cached tables.
> Supporting it improves performance by roughly 75% (the exact gain will vary).
> Running the code below:
> {code}
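> test("Null-safe equal comparison") {
>   val N = 20000000
>   val df = spark.range(N).repartition(20)
>   val benchmark = new Benchmark("Null-safe equal comparison", N)
>   // Register and cache the table so the query reads from the in-memory columnar cache.
>   df.createOrReplaceTempView("t")
>   spark.catalog.cacheTable("t")
>   // Run the query once up front so the cache is materialized before timing.
>   sql("select id from t where id <=> 1").collect()
>   // Time the same query over 10 iterations.
>   benchmark.addCase("Null-safe equal comparison", 10) { _ =>
>     sql("select id from t where id <=> 1").collect()
>   }
>   benchmark.run()
> }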
> {code}
> produces the results below:
> Before:
> {code}
> Running benchmark: Null-safe equal comparison
>   Running case: Null-safe equal comparison
>   Stopped after 10 iterations, 2098 ms
> Java HotSpot(TM) 64-Bit Server VM 1.8.0_45-b14 on Mac OS X 10.11.5
> Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz
> Null-safe equal comparison:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
> ------------------------------------------------------------------------------------------------
> Null-safe equal comparison                     204 /  210         98.1          10.2       1.0X
> {code}
> After:
> {code}
> Running benchmark: Null-safe equal comparison
>   Running case: Null-safe equal comparison
>   Stopped after 10 iterations, 478 ms
> Java HotSpot(TM) 64-Bit Server VM 1.8.0_45-b14 on Mac OS X 10.11.5
> Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz
> Null-safe equal comparison:              Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
> ------------------------------------------------------------------------------------------------
> Null-safe equal comparison                      42 /   48        474.1           2.1       1.0X
> {code}
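
To illustrate why pruning helps here, the following is a minimal sketch of how a null-safe equality predicate could be checked against per-batch statistics. It is not the actual change to InMemoryTableScanExec: `BatchStats`, `NullSafePruningSketch`, and `mightMatch` are hypothetical names, and the sketch assumes each cached batch records a lower bound, an upper bound, and a null count for the column, and that every batch holds at least one non-null value.

```scala
// Hypothetical, simplified per-batch column statistics (not Spark's internal classes).
// Assumes the batch contains at least one non-null value, so the bounds are meaningful.
final case class BatchStats(lower: Long, upper: Long, nullCount: Int)

object NullSafePruningSketch {
  // Returns true if the batch might contain a row satisfying `col <=> literal`.
  // `None` stands for a NULL literal; `NULL <=> NULL` is true, so only
  // batches that hold at least one null can match in that case.
  def mightMatch(stats: BatchStats, literal: Option[Long]): Boolean =
    literal match {
      case Some(v) => stats.lower <= v && v <= stats.upper
      case None    => stats.nullCount > 0
    }

  def main(args: Array[String]): Unit = {
    val batches = Seq(
      BatchStats(0L, 999L, 0),
      BatchStats(1000L, 1999L, 0),
      BatchStats(2000L, 2999L, 3)
    )
    // Only batches whose statistics admit a match need to be scanned.
    val forLiteral = batches.filter(mightMatch(_, Some(1L)))
    val forNull = batches.filter(mightMatch(_, None))
    println(s"id <=> 1 scans ${forLiteral.size} of ${batches.size} batches")
    println(s"id <=> NULL scans ${forNull.size} of ${batches.size} batches")
  }
}
```

In the benchmark above, `id <=> 1` can match rows in only a small share of the cached batches when values are clustered, so skipping the rest is one plausible source of the measured speedup. With the `repartition(20)` used there, how many batches are actually skipped depends on how the ids end up distributed.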



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
