Posted to user@spark.apache.org by Daniel Haviv <da...@gmail.com> on 2014/11/23 21:01:43 UTC

Converting a column to a map

Hi,
I have a column in my schemaRDD that is a map, but I'm unable to convert it
to a map. I've tried casting it to a Tuple2[String,String]:
val converted = jsonFiles.map(line => {
  line(10).asInstanceOf[Tuple2[String,String]]})

but I get a ClassCastException:
14/11/23 11:51:30 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 1.0
(TID 2, localhost): java.lang.ClassCastException:
org.apache.spark.sql.catalyst.expressions.GenericRow cannot be cast to
scala.Tuple2

And if I convert it to Iterable[String], I can only get the values without
the keys.

What is the correct data type I should convert it to?

Thanks,
Daniel

Re: Converting a column to a map

Posted by Yanbo <ya...@gmail.com>.
jsonFiles in your code is a schemaRDD rather than an RDD[Array].
If the value is a column of the schemaRDD, you can first use a Spark SQL query to select that column.
Alternatively, schemaRDD supports SQL-like operations such as select / where, which can also retrieve a specific column.
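
The ClassCastException quoted below shows that the column value is a GenericRow, i.e. a nested row that carries values but not field names, which would explain why casting to Iterable[String] yields only the values. A minimal sketch of one way to rebuild a map, assuming the field names can be read separately from the column's schema: the sketch is plain Scala with no Spark dependency, and the helper toMap, the object StructToMap, and the sample names and values are all hypothetical.

```scala
// Hypothetical helper: pair a nested struct's field names with one row's values.
// `fieldNames` stands in for the names read from the column's schema (assumption);
// `values` stands in for the nested row viewed as a plain sequence (assumption).
object StructToMap {
  def toMap(fieldNames: Seq[String], values: Seq[Any]): Map[String, Any] = {
    // Names and values are matched by position, so the lengths must agree.
    require(fieldNames.length == values.length, "expected one name per value")
    fieldNames.zip(values).toMap
  }

  def main(args: Array[String]): Unit = {
    val names = Seq("city", "zip")              // made-up field names
    val row: Seq[Any] = Seq("Haifa", "31000")   // made-up values
    val asMap = toMap(names, row)
    println(asMap("city"))                      // prints Haifa
  }
}
```

Matching by position only works if the names are listed in the same order as the values in the row, so the names should come from the schema of that same column rather than from a hand-written list.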

> On Nov 24, 2014, at 4:01 AM, Daniel Haviv <da...@gmail.com> wrote:
> 
> Hi,
> I have a column in my schemaRDD that is a map, but I'm unable to convert it to a map. I've tried casting it to a Tuple2[String,String]:
> val converted = jsonFiles.map(line=> { line(10).asInstanceOf[Tuple2[String,String]]})
> 
> but I get ClassCastException:
> 14/11/23 11:51:30 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 1.0 (TID 2, localhost): java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericRow cannot be cast to scala.Tuple2
> 
> And if I convert it to Iterable[String], I can only get the values without the keys.
> 
> What is the correct data type I should convert it to?
> 
> Thanks,
> Daniel
