Posted to user@spark.apache.org by Marco Colombo <in...@gmail.com> on 2016/07/25 14:14:42 UTC

JdbcRDD and DataFrame

Hi all,

I was using JdbcRDD, whose constructor accepts a function that returns a
DB connection. This is very useful because it lets me provide my own
connection handler.

I'm evaluating a move to DataFrames, but I cannot see how to provide such
a function and migrate my code. I want to use my own 'getConnection'
rather than provide connection details.

JdbcRDD(SparkContext sc,
       scala.Function0<java.sql.Connection> getConnection,
       .....,
to
 val df: DataFrame = hiveSqlContext.read.format("jdbc").options(options).load();

How can this be achieved?

Thanks!

Re: JdbcRDD and DataFrame

Posted by Marco Colombo <in...@gmail.com>.
Thanks, I will submit an improvement request.


-- 
Ing. Marco Colombo

Re: JdbcRDD and DataFrame

Posted by Mich Talebzadeh <mi...@gmail.com>.
I don't think there is such an option.

Using a connection pool through a DataFrame to connect to an RDBMS would
be a viable feature request.
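
In the meantime, one possible workaround (a sketch, not something confirmed
in this thread) is to keep JdbcRDD with the custom connection function and
turn the resulting RDD into a DataFrame. The Channel case class, the JDBC
URL, the credentials, and the partition bounds below are all hypothetical
placeholders; the query borrows the sh.channels table mentioned elsewhere
in the thread.

```scala
import java.sql.{Connection, DriverManager, ResultSet}

import org.apache.spark.SparkContext
import org.apache.spark.rdd.JdbcRDD
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical row type for illustration only.
case class Channel(channelId: String, channelDesc: String)

object JdbcRddToDataFrame {
  // Stand-in for a user-supplied connection handler, for example one that
  // borrows from a pool. The URL and credentials are placeholders.
  def getConnection(): Connection =
    DriverManager.getConnection(
      "jdbc:oracle:thin:@//dbhost:1521/service", "user", "password")

  def channels(sc: SparkContext, sqlContext: SQLContext): DataFrame = {
    val rdd = new JdbcRDD[Channel](
      sc,
      getConnection _,
      // JdbcRDD expects exactly two '?' placeholders, which it fills with
      // the bounds of each partition.
      "SELECT to_char(CHANNEL_ID), CHANNEL_DESC FROM sh.channels " +
        "WHERE CHANNEL_ID >= ? AND CHANNEL_ID <= ?",
      lowerBound = 1L,     // assumed key range, adjust to the real data
      upperBound = 100L,
      numPartitions = 4,
      mapRow = (rs: ResultSet) => Channel(rs.getString(1), rs.getString(2))
    )
    // An RDD of case-class instances can be converted by reflection.
    sqlContext.createDataFrame(rdd)
  }
}
```

A DataFrame built this way does not benefit from the column pruning and
filter pushdown of the JDBC data source, so any filtering has to go into
the SQL string itself.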

cheers

Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.




Re: JdbcRDD and DataFrame

Posted by Marco Colombo <in...@gmail.com>.
In getConnection I'm managing a connection pool.
I see no option for that in the docs.

Regards


-- 
Ing. Marco Colombo

Re: JdbcRDD and DataFrame

Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi Marco,

What does your user-defined getConnection function do, and why not use
the DataFrame API itself?

I guess it just supplies the connection attributes, which can be passed
as options:

val c = HiveContext.load("jdbc",
  Map("url" -> _ORACLEserver,
      "dbtable" -> "(SELECT to_char(CHANNEL_ID) AS CHANNEL_ID, CHANNEL_DESC FROM sh.channels)",
      "user" -> _username,
      "password" -> _password))
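
The load(source, options) shortcut used here has been deprecated since
Spark 1.4 in favour of the reader API that appears in the original
question. A minimal sketch of the same read with that API follows; the
object and method names are hypothetical, and the URL and credentials are
supplied by the caller. As far as this thread establishes, only connection
attributes can be passed this way, not a connection function.

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}

object JdbcReaderSketch {
  // Builds the option map for the JDBC data source from plain
  // connection attributes.
  def jdbcOptions(url: String, user: String, password: String): Map[String, String] =
    Map(
      "url" -> url,
      "dbtable" -> "(SELECT to_char(CHANNEL_ID) AS CHANNEL_ID, CHANNEL_DESC FROM sh.channels)",
      "user" -> user,
      "password" -> password
    )

  // Also accepts a HiveContext, since HiveContext extends SQLContext.
  def loadChannels(sqlContext: SQLContext,
                   url: String,
                   user: String,
                   password: String): DataFrame =
    sqlContext.read
      .format("jdbc")
      .options(jdbcOptions(url, user, password))
      .load()
}
```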

HTH


Dr Mich Talebzadeh





