Posted to user@spark.apache.org by "javacaoyu@163.com" <ja...@163.com> on 2022/09/19 09:52:38 UTC

[how to]RDD using JDBC data source in PySpark

Hi guys,

    Is there a way to use a JDBC data source with the RDD API in PySpark?

    I want to read data from MySQL, but PySpark has no equivalent of the JdbcRDD available in Java/Scala.
    I searched the documentation on the website and found no answer.

    So I need your help. Thank you very much.



javacaoyu@163.com

Re: Re: [how to]RDD using JDBC data source in PySpark

Posted by Bjørn Jørgensen <bj...@gmail.com>.
There is a PR for this now. [SPARK-40491][SQL] Expose a jdbcRDD function in
SparkContext <https://github.com/apache/spark/pull/37937>

On Mon, 19 Sep 2022 at 12:47, javacaoyu@163.com <ja...@163.com> wrote:

> Thank you, Bjørn Jørgensen, and thanks also to Sean Owen.
>
> The DataFrame API with .format("jdbc") is a good way to solve this.
> But for certain reasons I can't use the DataFrame API; I can only use the RDD API in
> PySpark.
> ...T_T...
>
> Thanks for all your help, but I still need a new idea to solve this. XD
>
> ------------------------------
> javacaoyu@163.com
>
>
> *From:* Bjørn Jørgensen <bj...@gmail.com>
> *Sent:* 2022-09-19 18:34
> *To:* javacaoyu@163.com
> *Cc:* Xiao, Alton <al...@sap.com.invalid>; user@spark.apache.org
> *Subject:* Re: Re: [how to]RDD using JDBC data source in PySpark
> https://www.projectpro.io/recipes/save-dataframe-mysql-pyspark
> and
> https://towardsdatascience.com/pyspark-mysql-tutorial-fa3f7c26dc7
>
> On Mon, 19 Sep 2022 at 12:29, javacaoyu@163.com <ja...@163.com> wrote:
>
>> Thank you for your answer, Alton.
>>
>> But I see that it is implemented in Scala.
>> I know Java/Scala can read data from MySQL with JdbcRDD fairly well,
>> but I want to do the same in PySpark.
>>
>> Could you give me more advice? Thank you very much.
>>
>> ------------------------------
>> javacaoyu@163.com
>>
>>
>> *From:* Xiao, Alton <al...@sap.com.INVALID>
>> *Sent:* 2022-09-19 18:04
>> *To:* javacaoyu@163.com; user@spark.apache.org
>> *Subject:* Re: [how to]RDD using JDBC data source in PySpark
>>
>> Hi javacaoyu:
>>
>> https://hevodata.com/learn/spark-mysql/#Spark-MySQL-Integration
>>
>> I think Spark already has MySQL integration.
>>
>> *From:* javacaoyu@163.com <ja...@163.com>
>> *Date:* Monday, September 19, 2022 17:53
>> *To:* user@spark.apache.org <us...@spark.apache.org>
>> *Subject:* [how to]RDD using JDBC data source in PySpark
>>
>> Hi guys,
>>
>>     Is there a way to use a JDBC data source with the RDD API in PySpark?
>>
>>     I want to read data from MySQL, but PySpark has no equivalent of the
>> JdbcRDD available in Java/Scala.
>>
>>     I searched the documentation on the website and found no answer.
>>
>>     So I need your help. Thank you very much.
>>
>> ------------------------------
>>
>> javacaoyu@163.com
>>
>>

-- 
Bjørn Jørgensen
Vestre Aspehaug 4, 6010 Ålesund
Norge

+47 480 94 297

Re: Re: [how to]RDD using JDBC data source in PySpark

Posted by "javacaoyu@163.com" <ja...@163.com>.
Thank you, Bjørn Jørgensen, and thanks also to Sean Owen.

The DataFrame API with .format("jdbc") is a good way to solve this.
But for certain reasons I can't use the DataFrame API; I can only use the RDD API in PySpark.
...T_T...

Thanks for all your help, but I still need a new idea to solve this. XD

javacaoyu@163.com
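
One possible way to stay within the RDD API, sketched here under assumptions that do not come from the thread: a pure-Python DB-API 2.0 driver is installed on every executor, the table has a numeric key column, and the query filters on an inclusive key range. The helper names (partition_bounds, fetch_range, jdbc_like_rdd) are hypothetical, and the range splitting is only modelled on the inclusive lower/upper bound idea of Scala's JdbcRDD. This is not the API proposed in SPARK-40491.

```python
def partition_bounds(lower, upper, num_partitions):
    """Split the inclusive key range [lower, upper] into contiguous sub-ranges."""
    length = 1 + upper - lower
    bounds = []
    for i in range(num_partitions):
        start = lower + (i * length) // num_partitions
        end = lower + ((i + 1) * length) // num_partitions - 1
        bounds.append((start, end))
    return bounds


def fetch_range(connect, query, bounds):
    """Run `query` for one (start, end) pair and yield the rows as tuples.

    `connect` is any zero-argument function returning a DB-API 2.0 connection
    (hypothetical parameter). `query` must contain exactly two placeholders,
    in the driver's own parameter style, for the lower and upper bound.
    """
    conn = connect()
    try:
        cur = conn.cursor()
        cur.execute(query, bounds)
        for row in cur.fetchall():
            yield tuple(row)
    finally:
        # Close the connection once the rows of this range are consumed.
        conn.close()


def jdbc_like_rdd(sc, connect, query, lower, upper, num_partitions):
    """Build an RDD in which each partition reads one key range."""
    bounds = partition_bounds(lower, upper, num_partitions)
    # One (start, end) pair per Spark partition, so each task opens one connection.
    ranges = sc.parallelize(bounds, len(bounds))
    return ranges.flatMap(lambda b: fetch_range(connect, query, b))
```

With a driver such as PyMySQL (an assumption), a call might look like jdbc_like_rdd(sc, lambda: pymysql.connect(host="localhost", user="user", password="secret", database="testdb"), "SELECT id, name FROM my_table WHERE id >= %s AND id <= %s", 1, 1000, 4), where every connection value is a placeholder. The rows travel through Python rather than the JVM, so this is likely slower than the built-in JDBC data source.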
 

Re: Re: [how to]RDD using JDBC data source in PySpark

Posted by Bjørn Jørgensen <bj...@gmail.com>.
https://www.projectpro.io/recipes/save-dataframe-mysql-pyspark
and
https://towardsdatascience.com/pyspark-mysql-tutorial-fa3f7c26dc7


Re: Re: [how to]RDD using JDBC data source in PySpark

Posted by Sean Owen <sr...@gmail.com>.
Just use the .format('jdbc') data source? This is built in, for all
languages. You can get an RDD out if you must.
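
A minimal sketch of that suggestion, assuming a MySQL server and the MySQL Connector/J jar on Spark's classpath. The helper names (jdbc_options, read_table_as_rdd, main) are hypothetical, and the host, database, table and credentials are placeholders that do not come from the thread.

```python
def jdbc_options(host, port, database, table, user, password):
    """Build the option dict for Spark's built-in JDBC data source."""
    return {
        "url": f"jdbc:mysql://{host}:{port}/{database}",
        "dbtable": table,
        "user": user,
        "password": password,
        # Driver class of MySQL Connector/J 8.x (assumed to be on the classpath).
        "driver": "com.mysql.cj.jdbc.Driver",
    }


def read_table_as_rdd(spark, options):
    """Load the table through the DataFrame reader, then drop down to an RDD."""
    df = spark.read.format("jdbc").options(**options).load()
    # DataFrame.rdd is an RDD of Row objects; convert each Row to a plain tuple.
    return df.rdd.map(tuple)


def main():
    # Imported here so the helpers above can be loaded without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("jdbc-to-rdd").getOrCreate()
    # Every connection value below is a placeholder.
    opts = jdbc_options("localhost", 3306, "testdb", "my_table", "user", "secret")
    rdd = read_table_as_rdd(spark, opts)
    print(rdd.take(5))
    spark.stop()
```

Calling main() from a script run with spark-submit would print the first five rows as tuples. The read still goes through the DataFrame reader, so this only helps when the downstream code, rather than the load itself, has to use the RDD API.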


Re: Re: [how to]RDD using JDBC data source in PySpark

Posted by "javacaoyu@163.com" <ja...@163.com>.
Thank you for your answer, Alton.

But I see that it is implemented in Scala.
I know Java/Scala can read data from MySQL with JdbcRDD fairly well,
but I want to do the same in PySpark.

Could you give me more advice? Thank you very much.

javacaoyu@163.com
 

Re: [how to]RDD using JDBC data source in PySpark

Posted by "Xiao, Alton" <al...@sap.com.INVALID>.
Hi javacaoyu,
https://hevodata.com/learn/spark-mysql/#Spark-MySQL-Integration
I think Spark already has MySQL integration.
