Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/01/03 10:51:00 UTC
[jira] [Commented] (SPARK-26433) Tail method for spark DataFrame
[ https://issues.apache.org/jira/browse/SPARK-26433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16732895#comment-16732895 ]
Hyukjin Kwon commented on SPARK-26433:
--------------------------------------
There are a few potential workarounds. For instance, http://www.swi.com/spark-rdd-getting-bottom-records/ or {{df.sort($"ColumnName".desc).show()}}.
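The sort-based workaround can be modeled in plain Python to show the idea: sort descending on a column, keep the first n rows, then reverse them to restore ascending order. This is only a sketch over a list of dicts, not Spark code. The helper name `last_n_by` and the sample rows are hypothetical, and it assumes the chosen column defines what "last" should mean.

```python
from operator import itemgetter

def last_n_by(rows, column, n):
    """Return the n rows with the largest `column` values, in ascending order."""
    # Sort descending on the column, as the sort(...desc) workaround does.
    descending = sorted(rows, key=itemgetter(column), reverse=True)
    # Keep the first n rows, then flip them back to ascending order.
    return descending[:n][::-1]

# Hypothetical sample data, for illustration only.
rows = [{"v1": v} for v in (300751, 299443, 301002, 299493, 300100)]
result = last_n_by(rows, "v1", 3)
print(result)
```

Note that this kind of approach returns the largest values of the sort column, which matches the last rows only if the data is meant to be ordered by that column.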
BTW, tail and head usually behave as below in Scala (IMHO):
{code}
scala> Seq(1, 2, 3).tail
res10: Seq[Int] = List(2, 3)
scala> Seq(1, 2, 3).head
res11: Int = 1
{code}
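The distinction matters for the request: Scala's tail drops the first element, whereas the ans.tail(3) call in the quoted issue presumably asks for the last three rows. A minimal Python sketch of the two meanings, using the hypothetical helper names `scala_style_tail` and `last_n`:

```python
def scala_style_tail(xs):
    """Everything except the first element, like Seq(1, 2, 3).tail in Scala."""
    return xs[1:]

def last_n(xs, n):
    """The final n elements, the presumed meaning of a call like tail(n)."""
    # xs[-0:] would return the whole list, so handle n <= 0 explicitly.
    return xs[-n:] if n > 0 else []

values = [1, 2, 3]
print(scala_style_tail(values))  # [2, 3]
print(last_n(values, 1))         # [3]
```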
> Tail method for spark DataFrame
> -------------------------------
>
> Key: SPARK-26433
> URL: https://issues.apache.org/jira/browse/SPARK-26433
> Project: Spark
> Issue Type: New Feature
> Components: PySpark
> Affects Versions: 2.4.0
> Reporter: Jan Gorecki
> Priority: Major
>
> There is a head method for Spark DataFrames, which works fine, but there doesn't seem to be a tail method.
> ```
> >>> ans
> DataFrame[v1: bigint]
> >>> ans.head(3)
> [Row(v1=299443), Row(v1=299493), Row(v1=300751)]
> >>> ans.tail(3)
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File "/home/jan/git/db-benchmark/spark/py-spark/lib/python3.6/site-packages/pyspark/sql/dataframe.py", line 1300, in __getattr__
> "'%s' object has no attribute '%s'" % (self.__class__.__name__, name))
> AttributeError: 'DataFrame' object has no attribute 'tail'
> ```
> I would like to request a tail method for Spark DataFrames.