Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/06/25 15:52:05 UTC

[jira] [Resolved] (SPARK-8629) R code in SparkR

     [ https://issues.apache.org/jira/browse/SPARK-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-8629.
------------------------------
    Resolution: Invalid

> R code in SparkR
> ----------------
>
>                 Key: SPARK-8629
>                 URL: https://issues.apache.org/jira/browse/SPARK-8629
>             Project: Spark
>          Issue Type: Question
>          Components: R
>            Reporter: Arun
>            Priority: Minor
>
> Data set:  
>   
> DC_City    Dc_Code  ItemNo     Itemdescription              date        Month   Year  SalesQuantity
> Hyderabad  11       100005010  more. Value Chana Dal 1 Kg.  9/16/2012   9-Sep   2012  1
> Hyderabad  11       100005010  more. Value Chana Dal 1 Kg.  12/21/2012  12-Dec  2012  1
> Hyderabad  11       100005010  more. Value Chana Dal 1 Kg.  1/12/2013   1-Jan   2013  1
> Hyderabad  11       100005010  more. Value Chana Dal 1 Kg.  1/27/2013   1-Jan   2013  3
> Hyderabad  11       100005011  more. Value Chana Dal 1 Kg.  2/1/2013    2-Feb   2013  2
> Hyderabad  11       100005011  more. Value Chana Dal 1 Kg.  2/12/2013   2-Feb   2013  3
> Hyderabad  11       100005011  more. Value Chana Dal 1 Kg.  2/13/2013   2-Feb   2013  2
> Hyderabad  11       100005011  more. Value Chana Dal 1 Kg.  2/14/2013   2-Feb   2013  1
> Hyderabad  11       100005011  more. Value Chana Dal 1 Kg.  2/15/2013   2-Feb   2013  8
> Hyderabad  11       100005012  more. Value Chana Dal 1 Kg.  2/16/2013   2-Feb   2013  18
> Hyderabad  11       100005012  more. Value Chana Dal 1 Kg.  2/17/2013   2-Feb   2013  19
> Hyderabad  11       100005012  more. Value Chana Dal 1 Kg.  2/18/2013   2-Feb   2013  18
> Hyderabad  11       100005012  more. Value Chana Dal 1 Kg.  2/19/2013   2-Feb   2013  18
> Hyderabad  11       100005012  more. Value Chana Dal 1 Kg.  2/20/2013   2-Feb   2013  16
> Hyderabad  11       100005013  more. Value Chana Dal 1 Kg.  2/21/2013   2-Feb   2013  25
> Hyderabad  11       100005013  more. Value Chana Dal 1 Kg.  2/22/2013   2-Feb   2013  19
> Hyderabad  11       100005013  more. Value Chana Dal 1 Kg.  2/23/2013   2-Feb   2013  17
> Hyderabad  11       100005013  more. Value Chana Dal 1 Kg.  2/24/2013   2-Feb   2013  39
> Hyderabad  11       100005013  more. Value Chana Dal 1 Kg.  2/25/2013   2-Feb   2013  23
> Code I used in R:
>   library(dplyr)  # provides filter() and select()
>   data <- read.csv("D:/R/Data_sale_quantity.csv", stringsAsFactors = FALSE)
>   factors <- unique(data$ItemNo)  # one entry per distinct item
>   df.allitems <- data.frame()
>   for (i in seq_along(factors)) {
>     data1 <- filter(data, ItemNo == factors[[i]])  # rows for the current item
>     data2 <- select(data1, DC_City, Itemdescription, ItemNo, date, Year, SalesQuantity)
>     data2$date <- as.Date(data2$date, format = "%m/%d/%Y")  # four-digit years, e.g. 9/16/2012
>     data3 <- data2[order(data2$date), ]  # sort the item's rows by date
>     df.allitems <- rbind(data3, df.allitems)  # append by row bind
>   }
>   write.csv(df.allitems, "E:/all_items.csv")
> The code is formatted more readably here:
> ---------------------------------------------------------------------------------------------
> http://apache-spark-user-list.1001560.n3.nabble.com/Convert-R-code-into-SparkR-code-for-spark-1-4-version-tp23489.html
> ---------------------------------------------------------------------------------------------
>   
> I have written some SparkR code:
>   data1 <- read.csv("D:/Data_sale_quantity_mini.csv") # read into a local R data.frame
>   df_1 <- createDataFrame(sqlContext, data1) # convert the R data.frame to a Spark DataFrame
>   distinctDF <- distinct(df_1) # remove duplicate rows
>
> # For the select I used:
>   df_2 <- select(distinctDF, "DC_City", "Itemdescription", "ItemNo", "date", "Year", "SalesQuantity") # keep only the needed columns
> I don't know how to:
>   1) create an empty SparkR DataFrame
>   2) use a for loop in SparkR
>   3) change the date format
>   4) find the length() of a Spark DataFrame
>   5) use rbind in SparkR
>
> Can you help me write the above code in SparkR?
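
One possible way to approach points 2 to 5, sketched in plain R rather than SparkR and not taken from the issue itself: the per-item loop in the question appears to be equivalent to a single sort by ItemNo and then by date, so no empty data frame, loop, or rbind is needed (the items come out in ascending ItemNo order rather than in the loop's order). The `sales` data frame below is a hypothetical stand-in built from four rows of the data set, and the format string assumes four-digit years.

```r
# Hypothetical inline sample standing in for Data_sale_quantity.csv;
# the values are four rows copied from the data set in the question.
sales <- data.frame(
  DC_City = "Hyderabad",
  ItemNo = c(100005011, 100005010, 100005010, 100005011),
  date = c("2/12/2013", "12/21/2012", "9/16/2012", "2/1/2013"),
  SalesQuantity = c(3, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Parse month/day/four-digit-year strings into Date values.
sales$date <- as.Date(sales$date, format = "%m/%d/%Y")

# A single sort by item and then by date replaces the per-item loop and rbind.
sorted <- sales[order(sales$ItemNo, sales$date), ]
rownames(sorted) <- NULL

# nrow() gives the row count; length() on a data frame counts columns instead.
n_rows <- nrow(sorted)
print(sorted)
```

In SparkR 1.4 the corresponding DataFrame calls would presumably be `arrange` for the sort, `count` for the number of rows, and `unionAll` in place of `rbind`; those calls are not exercised in this sketch.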


