Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/09/11 08:59:20 UTC
[jira] [Resolved] (SPARK-17439) QuantilesSummaries returns the wrong result after compression
[ https://issues.apache.org/jira/browse/SPARK-17439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-17439.
-------------------------------
Resolution: Fixed
Assignee: Tim Hunter
Fix Version/s: 2.1.0, 2.0.1
Resolved by https://github.com/apache/spark/pull/15002
> QuantilesSummaries returns the wrong result after compression
> -------------------------------------------------------------
>
> Key: SPARK-17439
> URL: https://issues.apache.org/jira/browse/SPARK-17439
> Project: Spark
> Issue Type: Bug
> Reporter: Tim Hunter
> Assignee: Tim Hunter
> Labels: correctness
> Fix For: 2.0.1, 2.1.0
>
>
> [~clockfly] found the following corner case that returns the wrong quantile (off by 1):
> {code}
> test("test QuantileSummaries compression") {
>   var left = new QuantileSummaries(10000, 0.0001)
>   System.out.println("LEFT RIGHT")
>   System.out.println("====================")
>   (0 to 10).foreach { index =>
>     left = left.insert(index)
>     left = left.compress()
>     var right = new QuantileSummaries(10000, 0.0001)
>     (0 to index).foreach(right.insert(_))
>     right = right.compress()
>     System.out.println(s"${left.query(0.5)} ${right.query(0.5)}")
>   }
> }
> {code}
> The result is:
> {code}
> LEFT RIGHT
> ====================
> 0.0 0.0
> 0.0 1.0
> 0.0 1.0
> 0.0 1.0
> 1.0 2.0
> 1.0 2.0
> 2.0 3.0
> 2.0 3.0
> 3.0 4.0
> 3.0 4.0
> 4.0 5.0
> {code}
> The "LEFT" column shows the output when QuantileSummaries is used in a Window function; the "RIGHT" column shows the expected result. The difference between the two columns is that "LEFT" compresses the QuantileSummaries storage after every insert, while "RIGHT" compresses only once, after all values have been inserted.
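
As a quick sanity check on the "off by 1" claim, the two columns reported above can be compared row by row. The sketch below is illustrative only: the object name `OffByOneCheck` and the helper `mismatches` are hypothetical, and the values are copied by hand from the printed output rather than produced by QuantileSummaries.

```scala
object OffByOneCheck {
  // Values copied from the LEFT and RIGHT columns of the reported output.
  val left: Seq[Double]  = Seq(0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0)
  val right: Seq[Double] = Seq(0.0, 1.0, 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0, 5.0)

  // Rows where the columns disagree, as (index, right - left).
  def mismatches: Seq[(Int, Double)] =
    left.zip(right).zipWithIndex.collect {
      case ((l, r), index) if l != r => (index, r - l)
    }

  def main(args: Array[String]): Unit =
    mismatches.foreach { case (index, gap) =>
      println(s"index=$index gap=$gap")
    }
}
```

With these values, only the first row (index 0) agrees; every later row reports a gap of exactly 1.0, which matches the description of the result being off by one.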