Posted to user@spark.apache.org by Franco Barrientos <fr...@exalitica.com> on 2014/11/11 20:11:09 UTC

S3 table to spark sql

How can I create a date field in Spark SQL? I have an S3 table and I load it
into an RDD.

 

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

import sqlContext.createSchemaRDD

 

case class trx_u3m(id: String, local: String, fechau3m: String, rubro: Int,
sku: String, unidades: Double, monto: Double)

 

val tabla = sc.textFile("s3n://exalitica.com/trx_u3m/trx_u3m.txt")
  .map(_.split(","))
  .map(p => trx_u3m(p(0).trim, p(1).trim, p(2).trim,
    p(3).trim.toInt, p(4).trim, p(5).trim.toDouble, p(6).trim.toDouble))

tabla.registerTempTable("trx_u3m")

 

Now my problem: how can I transform the string variable (fechau3m) into a
date variable?

 

Franco Barrientos
Data Scientist

Málaga #115, Of. 1003, Las Condes.
Santiago, Chile.
(+562)-29699649
(+569)-76347893

franco.barrientos@exalitica.com

www.exalitica.com

 


Re: S3 table to spark sql

Posted by Rishi Yadav <ri...@infoobjects.com>.
Simple:

scala> val date = new java.text.SimpleDateFormat("yyyyMMdd").parse(fechau3m)

should work. Replace "yyyyMMdd" with whatever format fechau3m is actually in.
Note the capital "MM" for month-of-year: lowercase "mm" means minute-of-hour
and would silently produce wrong dates.
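Because SimpleDateFormat accepts a wrong pattern without complaint, it is worth wrapping the parse in a small helper and checking it on a known value before using it on real data. A minimal sketch (parseFecha is a hypothetical helper name, not part of any Spark API); it also converts to java.sql.Date, which is what Spark SQL understands:

```scala
import java.text.SimpleDateFormat
import java.sql.Date

// "yyyyMMdd": capital "MM" is month-of-year; lowercase "mm" would be
// minutes and would silently yield wrong dates.
def parseFecha(s: String): Date = {
  val fmt = new SimpleDateFormat("yyyyMMdd")
  fmt.setLenient(false) // reject malformed input instead of guessing
  new Date(fmt.parse(s).getTime)
}

println(parseFecha("20141111")) // 2014-11-11
```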

If you want to do it at case class level:

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)
// a HiveContext is generally a good idea

import sqlContext.createSchemaRDD



case class trx_u3m(id: String, local: String, fechau3m: java.sql.Date,
  rubro: Int, sku: String, unidades: Double, monto: Double)

val tabla = sc.textFile("s3n://exalitica.com/trx_u3m/trx_u3m.txt")
  .map(_.split(","))
  .map { p =>
    // use java.sql.Date, not java.util.Date: Spark SQL's schema
    // inference only understands the java.sql date/time types
    val fecha = new java.sql.Date(
      new java.text.SimpleDateFormat("yyyyMMdd").parse(p(2).trim).getTime)
    trx_u3m(p(0).trim, p(1).trim, fecha,
      p(3).trim.toInt, p(4).trim, p(5).trim.toDouble, p(6).trim.toDouble)
  }

tabla.registerTempTable("trx_u3m")
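The per-line parsing can be sanity-checked locally, without a cluster, by applying the same logic to one hand-written sample line. A sketch (the sample line below is made up to mirror the trx_u3m.txt column order, not real data):

```scala
import java.text.SimpleDateFormat

case class trx_u3m(id: String, local: String, fechau3m: java.sql.Date,
                   rubro: Int, sku: String, unidades: Double, monto: Double)

// split one CSV line and build the case class, parsing column 2 as a date
def toRow(line: String): trx_u3m = {
  val p = line.split(",").map(_.trim)
  val fecha = new java.sql.Date(
    new SimpleDateFormat("yyyyMMdd").parse(p(2)).getTime)
  trx_u3m(p(0), p(1), fecha, p(3).toInt, p(4), p(5).toDouble, p(6).toDouble)
}

// hypothetical sample row
val row = toRow("1,STGO,20141111,42,SKU-9,3.0,1500.0")
println(row.fechau3m) // 2014-11-11
```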


On Tue, Nov 11, 2014 at 11:11 AM, Franco Barrientos <
franco.barrientos@exalitica.com> wrote:
