Posted to user@spark.apache.org by nsareen <ns...@gmail.com> on 2014/11/10 08:39:07 UTC

Efficient Key Structure in pairRDD

Hi,

We are trying to adopt Spark for our application.

We have an analytical application that stores data in star schemas (SQL
Server). All the cubes are loaded into a key/value structure and saved in
Trove (an in-memory collection). Here the key is a short array in which each
short number represents a dimension member.
E.g., the tuple (CampaignX, Product1, Region_south, 10.23232) gets converted to
Trove key [[12322], [45232], [53421]] and value [10.23232].

This is done to avoid storing collections of string objects as keys in Trove.
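For illustration, the string-to-short encoding described above can be sketched as a small dictionary encoder. This is a hypothetical sketch: DimDictionary, idOf, and encode are made-up names, not part of Trove or Spark.

```scala
import scala.collection.mutable

// Hypothetical dictionary encoder: assigns each distinct dimension-member
// string a compact Short id, mirroring the Trove key layout in the post.
class DimDictionary {
  private val ids = mutable.LinkedHashMap.empty[String, Short]

  // Returns the existing id for a member, or assigns the next free one.
  def idOf(member: String): Short =
    ids.getOrElseUpdate(member, ids.size.toShort)

  // Encodes a full dimension tuple into a compact key.
  def encode(members: Seq[String]): Vector[Short] =
    members.map(idOf).toVector
}
```

Encoding (CampaignX, Product1, Region_south) this way yields a three-element Vector[Short] that can serve as the key, with the measure (10.23232) as the value.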

Now, can we save this data structure in Spark using a pairRDD? If yes, is a
key/value pair an ideal way of storing the data in Spark and retrieving it
for analysis, or is there a better data structure we could create that would
help us build and process the RDD?
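One caveat worth flagging (an assumption about the JVM/Scala side, not something stated in the post): Array[Short] compares by reference on the JVM, so using the raw short array as a pairRDD key silently breaks operations like reduceByKey and lookup. An immutable Vector[Short] (or a case class) has value-based equals/hashCode and is safe. A minimal sketch, with the Spark calls confined to comments and the member ids made up to fit in a Short:

```scala
object KeyEquality {
  // The compact dimension key, mirroring the Trove layout but with
  // value semantics.
  type DimKey = Vector[Short]

  // Arrays use reference equality: two equal-content keys never match.
  def arrayKeysEqual: Boolean =
    Array[Short](123, 452, 534) == Array[Short](123, 452, 534)

  // Vectors use structural equality: safe to use as pairRDD keys.
  def vectorKeysEqual: Boolean =
    Vector[Short](123, 452, 534) == Vector[Short](123, 452, 534)

  // With Spark this would look like (sc being a SparkContext):
  //   val cube: RDD[(DimKey, Double)] =
  //     sc.parallelize(Seq((Vector[Short](123, 452, 534), 10.23232)))
  //   cube.reduceByKey(_ + _)   // groups correctly only with value-equal keys
}
```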

Nitin.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Efficient-Key-Structure-in-pairRDD-tp18461.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org


Re: Efficient Key Structure in pairRDD

Posted by nsareen <ns...@gmail.com>.
Spark devs / users, help in this regard would be appreciated; we are kind of
stuck at this point.



--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Efficient-Key-Structure-in-pairRDD-tp18461p18557.html