Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/06/25 15:52:05 UTC
[jira] [Resolved] (SPARK-8629) R code in SparkR
[ https://issues.apache.org/jira/browse/SPARK-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-8629.
------------------------------
Resolution: Invalid
> R code in SparkR
> ----------------
>
> Key: SPARK-8629
> URL: https://issues.apache.org/jira/browse/SPARK-8629
> Project: Spark
> Issue Type: Question
> Components: R
> Reporter: Arun
> Priority: Minor
>
> Data set:
>
> DC_City Dc_Code ItemNo Itemdescription date Month Year SalesQuantity
> Hyderabad 11 100005010 more. Value Chana Dal 1 Kg. 9/16/2012 9-Sep 2012 1
> Hyderabad 11 100005010 more. Value Chana Dal 1 Kg. 12/21/2012 12-Dec 2012 1
> Hyderabad 11 100005010 more. Value Chana Dal 1 Kg. 1/12/2013 1-Jan 2013 1
> Hyderabad 11 100005010 more. Value Chana Dal 1 Kg. 1/27/2013 1-Jan 2013 3
> Hyderabad 11 100005011 more. Value Chana Dal 1 Kg. 2/1/2013 2-Feb 2013 2
> Hyderabad 11 100005011 more. Value Chana Dal 1 Kg. 2/12/2013 2-Feb 2013 3
> Hyderabad 11 100005011 more. Value Chana Dal 1 Kg. 2/13/2013 2-Feb 2013 2
> Hyderabad 11 100005011 more. Value Chana Dal 1 Kg. 2/14/2013 2-Feb 2013 1
> Hyderabad 11 100005011 more. Value Chana Dal 1 Kg. 2/15/2013 2-Feb 2013 8
> Hyderabad 11 100005012 more. Value Chana Dal 1 Kg. 2/16/2013 2-Feb 2013 18
> Hyderabad 11 100005012 more. Value Chana Dal 1 Kg. 2/17/2013 2-Feb 2013 19
> Hyderabad 11 100005012 more. Value Chana Dal 1 Kg. 2/18/2013 2-Feb 2013 18
> Hyderabad 11 100005012 more. Value Chana Dal 1 Kg. 2/19/2013 2-Feb 2013 18
> Hyderabad 11 100005012 more. Value Chana Dal 1 Kg. 2/20/2013 2-Feb 2013 16
> Hyderabad 11 100005013 more. Value Chana Dal 1 Kg. 2/21/2013 2-Feb 2013 25
> Hyderabad 11 100005013 more. Value Chana Dal 1 Kg. 2/22/2013 2-Feb 2013 19
> Hyderabad 11 100005013 more. Value Chana Dal 1 Kg. 2/23/2013 2-Feb 2013 17
> Hyderabad 11 100005013 more. Value Chana Dal 1 Kg. 2/24/2013 2-Feb 2013 39
> Hyderabad 11 100005013 more. Value Chana Dal 1 Kg. 2/25/2013 2-Feb 2013 23
> Code I used in R:
> library(dplyr)  # filter() and select() below are the dplyr versions
> data <- read.csv("D:/R/Data_sale_quantity.csv", stringsAsFactors = FALSE)
> factors <- unique(data$ItemNo)
> df.allitems <- data.frame()
> for (i in seq_along(factors)) {
>   # Keep the rows for one item, then the columns of interest
>   data1 <- filter(data, ItemNo == factors[[i]])
>   data2 <- select(data1, DC_City, Itemdescription, ItemNo, date, Year, SalesQuantity)
>   # Dates are month/day/four-digit year, hence %Y rather than %y
>   data2$date <- as.Date(data2$date, format = "%m/%d/%Y")
>   data3 <- data2[order(data2$date), ]
>   df.allitems <- rbind(data3, df.allitems)  # Append by row bind
> }
>
> write.csv(df.allitems, "E:/all_items.csv")
> The code is also posted, with clearer formatting, at:
> ---------------------------------------------------------------------------------------------
> http://apache-spark-user-list.1001560.n3.nabble.com/Convert-R-code-into-SparkR-code-for-spark-1-4-version-tp23489.html
> ---------------------------------------------------------------------------------------------
>
> I have written some SparkR code:
> data1 <- read.csv("D:/Data_sale_quantity_mini.csv") # read in R
> df_1 <- createDataFrame(sqlContext, data1) # converts R data.frame to Spark DataFrame
> factors <- distinct(df_1) # removes duplicate rows
>
> # For select I used:
> df_2 <- select(factors, "DC_City", "Itemdescription", "ItemNo", "date", "Year", "SalesQuantity") # select columns
> I don't know how to:
> 1) create an empty SparkR DataFrame
> 2) use a for loop in SparkR
> 3) change the date format
> 4) find the length() of a Spark DataFrame
> 5) use rbind in SparkR
>
> Can you help me write the above code in SparkR?
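
A possible starting point for the question above, sketched in plain R rather than SparkR: the per-item loop with rbind can likely be replaced by one sort on ItemNo and then date, which needs neither an empty data frame nor a loop. The inline sample and the names `sales` and `sorted` are assumptions made for illustration. Note that the loop above prepends each item group, so its item order differs from the ascending order produced here. SparkR has an `arrange` function that may express the same sort on a Spark DataFrame, but that is not verified in this sketch.

```r
# Small inline sample standing in for Data_sale_quantity.csv
# (a few rows copied from the data set in the question).
sales <- data.frame(
  DC_City = "Hyderabad",
  ItemNo = c(100005011, 100005010, 100005010, 100005011),
  date = c("2/12/2013", "12/21/2012", "9/16/2012", "2/1/2013"),
  SalesQuantity = c(3, 1, 1, 2),
  stringsAsFactors = FALSE
)

# Parse month/day/four-digit-year strings into Date values.
sales$date <- as.Date(sales$date, format = "%m/%d/%Y")

# A single sort by item, then by date, replaces the loop and rbind.
sorted <- sales[order(sales$ItemNo, sales$date), ]
rownames(sorted) <- NULL
print(sorted)
```

The row count of the result, `nrow(sorted)`, is the plain-R counterpart of the length the question asks about.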