Posted to issues@spark.apache.org by "Asif Khan (JIRA)" <ji...@apache.org> on 2018/09/22 14:41:00 UTC
[jira] [Created] (SPARK-25512) Using RowNumbers in SparkR Dataframe
Asif Khan created SPARK-25512:
---------------------------------
Summary: Using RowNumbers in SparkR Dataframe
Key: SPARK-25512
URL: https://issues.apache.org/jira/browse/SPARK-25512
Project: Spark
Issue Type: Bug
Components: SparkR
Affects Versions: 2.3.1
Reporter: Asif Khan
Hi,
I have a use case where I have a SparkR DataFrame and I want to iterate over it in a for loop using its row numbers. Is this possible?
The only solution I have now is to collect() the SparkR DataFrame into an R data.frame, which brings the entire dataset onto the driver node, and then iterate over it by row number. But because the for loop executes only on the driver node, I lose the parallel processing advantage of Spark, which was the whole purpose of using it. Please help.
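(A common alternative to collecting and looping on the driver, sketched here as a hedged suggestion rather than an official answer: SparkR's dapply() applies an R function to each partition on the executors, so per-row work runs in parallel. The data and column names below are illustrative assumptions, not from the original report.)

```r
library(SparkR)
sparkR.session()

# Illustrative DataFrame; replace with your own data.
df <- createDataFrame(mtcars)

# Schema of the result returned by the partition function.
schema <- structType(structField("mpg", "double"),
                     structField("mpg_doubled", "double"))

result <- dapply(df, function(part) {
  # 'part' is an ordinary R data.frame holding one partition;
  # any row-wise logic here runs on the executors, not the driver.
  data.frame(mpg = part$mpg, mpg_doubled = part$mpg * 2)
}, schema)

head(collect(result))
```

If the logic needs a key-wise grouping rather than arbitrary partitions, gapply() works the same way per group. Note that Spark DataFrames have no inherent row order, so "row number" is only meaningful after defining an ordering (e.g. via a window function over a sort column).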
Thank You,
Asif Khan
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org