Posted to issues@spark.apache.org by "peay (JIRA)" <ji...@apache.org> on 2017/07/27 17:00:09 UTC
[jira] [Created] (SPARK-21550) approxQuantiles throws "next on empty iterator" on empty data
peay created SPARK-21550:
----------------------------
Summary: approxQuantiles throws "next on empty iterator" on empty data
Key: SPARK-21550
URL: https://issues.apache.org/jira/browse/SPARK-21550
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 2.1.0
Reporter: peay
The documentation says:
{code}
null and NaN values will be removed from the numerical column before calculation. If
the dataframe is empty or the column only contains null or NaN, an empty array is returned.
{code}
However, this small PySpark example, which filters the DataFrame down to zero rows before calling approxQuantile,
{code}
from pyspark.sql.functions import col
sql_context.range(10).filter(col("id") == 42).approxQuantile("id", [0.99], 0.001)
{code}
throws
{code}
Py4JJavaError: An error occurred while calling o3493.approxQuantile.
: java.util.NoSuchElementException: next on empty iterator
at scala.collection.Iterator$$anon$2.next(Iterator.scala:39)
at scala.collection.Iterator$$anon$2.next(Iterator.scala:37)
at scala.collection.IndexedSeqLike$Elements.next(IndexedSeqLike.scala:63)
at scala.collection.IterableLike$class.head(IterableLike.scala:107)
at scala.collection.mutable.ArrayOps$ofRef.scala$collection$IndexedSeqOptimized$$super$head(ArrayOps.scala:186)
at scala.collection.IndexedSeqOptimized$class.head(IndexedSeqOptimized.scala:126)
at scala.collection.mutable.ArrayOps$ofRef.head(ArrayOps.scala:186)
at scala.collection.TraversableLike$class.last(TraversableLike.scala:431)
at scala.collection.mutable.ArrayOps$ofRef.scala$collection$IndexedSeqOptimized$$super$last(ArrayOps.scala:186)
at scala.collection.IndexedSeqOptimized$class.last(IndexedSeqOptimized.scala:132)
at scala.collection.mutable.ArrayOps$ofRef.last(ArrayOps.scala:186)
at org.apache.spark.sql.catalyst.util.QuantileSummaries.query(QuantileSummaries.scala:207)
at org.apache.spark.sql.execution.stat.StatFunctions$$anonfun$multipleApproxQuantiles$1$$anonfun$apply$1.apply$mcDD$sp(StatFunctions.scala:92)
at org.apache.spark.sql.execution.stat.StatFunctions$$anonfun$multipleApproxQuantiles$1$$anonfun$apply$1.apply(StatFunctions.scala:92)
at org.apache.spark.sql.execution.stat.StatFunctions$$anonfun$multipleApproxQuantiles$1$$anonfun$apply$1.apply(StatFunctions.scala:92)
{code}
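Until this is fixed, one possible workaround is to check whether the column has any non-null rows before calling approxQuantile, and to return the documented empty result otherwise. The following is only a minimal sketch: the helper name safe_approx_quantile is hypothetical and not part of PySpark, and it assumes that df is a DataFrame exposing select, dropna, head and approxQuantile. Whether dropna also removes NaN values should be verified against the Spark version in use.

```python
def safe_approx_quantile(df, column, probabilities, relative_error):
    """Hypothetical wrapper around DataFrame.approxQuantile.

    Returns an empty list, as the documentation promises, when the
    column has no non-null rows, instead of letting the call fail.
    """
    # head(1) fetches at most one row, so the emptiness check is cheap
    # compared with a full count().
    has_data = len(df.select(column).dropna().head(1)) > 0
    if not has_data:
        return []
    return df.approxQuantile(column, probabilities, relative_error)
```

With the example above, safe_approx_quantile(sql_context.range(10).filter(col("id") == 42), "id", [0.99], 0.001) would take the empty branch and never reach the failing QuantileSummaries.query call. The extra head(1) job costs one additional pass over the filter, which is the trade-off for avoiding the exception.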