Posted to user@ignite.apache.org by Vignesh Irulappan <vi...@hcl.com> on 2016/05/12 08:12:32 UTC

How to cache Spark Dataframe in Apache ignite

Hello,

I am writing code to cache RDBMS data using a Spark SQLContext JDBC connection. Once a DataFrame is created, I want to cache that result set using Apache Ignite so that other applications can use it. Here is the code snippet.

import org.apache.ignite.spark.IgniteContext
import org.apache.spark.{SparkConf, SparkContext}

object test
{

  def main(args: Array[String]): Unit =
  {

      // Not used below
      val configuration = new Configuration()
      // Path to the Ignite Spring XML configuration
      val config = "src/main/scala/config.xml"

      val sparkConf = new SparkConf().setAppName("test").setMaster("local[*]")
      val sc = new SparkContext(sparkConf)
      val sqlContext = new org.apache.spark.sql.SQLContext(sc)
      // mysql_table_statement (a table name or subquery) is defined elsewhere
      val sql_dump1 = sqlContext.read.format("jdbc")
        .option("url", "jdbc URL")
        .option("driver", "com.mysql.jdbc.Driver")
        .option("dbtable", mysql_table_statement)
        .option("user", "username")
        .option("password", "pass")
        .load()

      val ic = new IgniteContext[Integer, Integer](sc, config)

      val sharedrdd = ic.fromCache("hbase_metadata")

      //How to cache sql_dump1 dataframe

  }
}

Now the question is how to cache a DataFrame. IgniteRDD has a savePairs method, but here it accepts an RDD of Integer keys and Integer values, whereas I have a DataFrame; even if I convert it to an RDD, I only get an RDD[Row]. More specifically, what if the values are strings rather than Integers? Is it a good idea to cache the DataFrame, or is there a better approach to caching the result set?

Thanks and Regards,
Vignesh


::DISCLAIMER::
----------------------------------------------------------------------------------------------------------------------------------------------------

The contents of this e-mail and any attachment(s) are confidential and intended for the named recipient(s) only.
E-mail transmission is not guaranteed to be secure or error-free as information could be intercepted, corrupted,
lost, destroyed, arrive late or incomplete, or may contain viruses in transmission. The e mail and its contents
(with or without referred errors) shall therefore not attach any liability on the originator or HCL or its affiliates.
Views or opinions, if any, presented in this email are solely those of the author and may not necessarily reflect the
views or opinions of HCL or its affiliates. Any form of reproduction, dissemination, copying, disclosure, modification,
distribution and / or publication of this message without the prior written consent of authorized representative of
HCL is strictly prohibited. If you have received this email in error please delete it and notify the sender immediately.
Before opening any email and/or attachments, please check them for viruses and other defects.

----------------------------------------------------------------------------------------------------------------------------------------------------

Re: How to cache Spark Dataframe in Apache ignite

Posted by Denis Magda <dm...@gridgain.com>.
Hi,

Replied to you on StackOverflow:
http://stackoverflow.com/questions/37180715/how-to-cache-dataframe-in-apache-ignite/37192299#37192299
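
A minimal sketch of one way to store the rows, assuming the Ignite 1.x Spark integration used in the question (a generic IgniteContext[K, V] and IgniteRDD.savePairs); it is not necessarily the approach taken in the linked answer. The key and value types come from the type parameters of IgniteContext, so they do not have to be Integer. The DataFrameToIgnite object, the rowToPair helper, the use of the first column as the key, and the '|'-joined row as the value are all illustrative assumptions, not part of Ignite or Spark.

```scala
import org.apache.ignite.spark.IgniteContext
import org.apache.spark.SparkContext
import org.apache.spark.sql.{DataFrame, Row}

// Hypothetical helper object for illustration; not part of Ignite or Spark.
object DataFrameToIgnite {

  // Assumption: column 0 holds a non-null, unique key.
  // The value is the whole row rendered as a '|'-separated string.
  def rowToPair(row: Row): (String, String) =
    (row.get(0).toString, row.mkString("|"))

  // Converts the DataFrame to an RDD of (key, value) pairs and writes
  // them into the named Ignite cache through the shared RDD.
  def save(sc: SparkContext, df: DataFrame,
           springConfig: String, cacheName: String): Unit = {
    // K and V are chosen here; they must match the pairs passed to savePairs.
    val ic = new IgniteContext[String, String](sc, springConfig)
    val sharedRdd = ic.fromCache(cacheName)
    sharedRdd.savePairs(df.rdd.map(rowToPair))
  }
}
```

With the names from the question, the call would be DataFrameToIgnite.save(sc, sql_dump1, config, "hbase_metadata"), provided that cache is defined in config.xml. Storing each row as a flat string discards the column schema, so this suits simple key lookups from other applications more than querying by column.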
