Posted to user@spark.apache.org by Mich Talebzadeh <mi...@peridale.co.uk> on 2016/02/12 11:45:25 UTC

Connection via JDBC to Oracle hangs after count call

Hi,

I use the following to connect to an Oracle DB from Spark shell 1.5.2:

spark-shell --master spark://50.140.197.217:7077 --driver-class-path /home/hduser/jars/ojdbc6.jar

In Scala I do:

scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@f9d4387

scala> val channels = sqlContext.read.format("jdbc").options(
     |      Map("url" -> "jdbc:oracle:thin:@rhes564:1521:mydb",
     |      "dbtable" -> "(select * from sh.channels where channel_id = 14)",
     |      "user" -> "sh",
     |      "password" -> "xxxxxxx")).load
channels: org.apache.spark.sql.DataFrame = [CHANNEL_ID: decimal(0,-127), CHANNEL_DESC: string, CHANNEL_CLASS: string, CHANNEL_CLASS_ID: decimal(0,-127), CHANNEL_TOTAL: string, CHANNEL_TOTAL_ID: decimal(0,-127)]

scala> channels.count()

But this last command just hangs.

Any ideas appreciated.

Thanks,

Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com

NOTE: The information in this email is proprietary and confidential. This
message is for the designated recipient only, if you are not the intended
recipient, you should destroy it immediately. Any information in this
message shall not be understood as given or endorsed by Peridale Technology
Ltd, its subsidiaries or their employees, unless expressly so stated. It is
the responsibility of the recipient to ensure that this email is virus free,
therefore neither Peridale Technology Ltd, its subsidiaries nor their
employees accept any responsibility.
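
A note on the schema above: decimal(0,-127) is how the Oracle JDBC driver reports NUMBER columns declared without precision or scale, and such columns are a known source of trouble for Spark's JDBC source in the 1.x releases. The workaround that Mich later confirms in this thread is to convert them inside the pushed-down subquery so Spark never has to map the unbounded type itself. A minimal sketch, assuming the same table, URL and credentials as above:

// Sketch only: same connection details as in the question above.
// Converting the unconstrained NUMBER columns in the subquery avoids the
// decimal(0,-127) mapping on the Spark side.
val channels = sqlContext.read.format("jdbc").options(
  Map(
    "url"      -> "jdbc:oracle:thin:@rhes564:1521:mydb",
    "dbtable"  -> ("(select to_char(channel_id) as channel_id, channel_desc, channel_class, " +
                   "to_char(channel_class_id) as channel_class_id, channel_total, " +
                   "to_char(channel_total_id) as channel_total_id " +
                   "from sh.channels where channel_id = 14)"),
    "user"     -> "sh",
    "password" -> "xxxxxxx")).load()

channels.count()

Only the to_char form is demonstrated later in the thread; a cast to a bounded type (for example cast(channel_id as number(10))) should behave the same way. Separately, --driver-class-path only puts ojdbc6.jar on the driver: the executors need the driver as well (for example via --jars /home/hduser/jars/ojdbc6.jar), although a missing driver there normally shows up as a ClassNotFoundException in the executor logs rather than a silent hang.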

 

 


Re: Connection via JDBC to Oracle hangs after count call

Posted by Chanh Le <gi...@gmail.com>.
Hi Mich,

I have a stored procedure in Oracle, written like this (an SP to get info):

PKG_ETL.GET_OBJECTS_INFO(
        p_LAST_UPDATED VARCHAR2,
        p_OBJECT_TYPE VARCHAR2,
        p_TABLE OUT SYS_REFCURSOR);

How do I call it from Spark, given that the output is a cursor (p_TABLE OUT SYS_REFCURSOR)?


Thanks.
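
Spark's JDBC data source only takes a table name or a subquery in the dbtable option, so it cannot consume a REF CURSOR OUT parameter directly. A minimal sketch of one way around this, assuming ojdbc6.jar is on the driver classpath and reusing the connection details from earlier in the thread; the bind values and the two columns read from the cursor are illustrative only, not taken from the real PKG_ETL.GET_OBJECTS_INFO output:

import java.sql.DriverManager
import oracle.jdbc.OracleTypes

// Open a plain JDBC connection on the driver and call the procedure.
val conn = DriverManager.getConnection(
  "jdbc:oracle:thin:@rhes564:1521:mydb", "sh", "xxxxxxx")
val cs = conn.prepareCall("{ call PKG_ETL.GET_OBJECTS_INFO(?, ?, ?) }")
cs.setString(1, "2016-07-01")                   // p_LAST_UPDATED (illustrative)
cs.setString(2, "TABLE")                        // p_OBJECT_TYPE  (illustrative)
cs.registerOutParameter(3, OracleTypes.CURSOR)  // p_TABLE OUT SYS_REFCURSOR
cs.execute()

// The OUT parameter comes back as an ordinary JDBC ResultSet.
val rs = cs.getObject(3).asInstanceOf[java.sql.ResultSet]

// Drain the cursor on the driver, then hand the rows to Spark.
val rows = Iterator.continually(rs)
  .takeWhile(_.next())
  .map(r => (r.getString(1), r.getString(2)))   // adjust to the cursor's real columns
  .toList

val df = sqlContext.createDataFrame(rows).toDF("col1", "col2")
df.show()

rs.close(); cs.close(); conn.close()

Because this collects the cursor's rows on the driver before Spark sees them, it only makes sense for modest result sets; for large outputs it scales better to have the procedure materialise its result into a staging table (or expose it through a view or pipelined table function) and read that through the normal JDBC source, as in the examples quoted below.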


> On Jul 11, 2016, at 4:18 PM, Mark Vervuurt <m....@gmail.com> wrote:
> 
> Thanks Mich,
> 
> we have got it working using the example here under ;)
> 
> Mark
> 
>> On 11 Jul 2016, at 09:45, Mich Talebzadeh <mich.talebzadeh@gmail.com <ma...@gmail.com>> wrote:
>> 
>> Hi Mark,
>> 
>> Hm. It should work. This is Spark 1.6.1 on Oracle 12c
>>  
>>  
>> scala> val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
>> HiveContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@70f446c
>>  
>> scala> var _ORACLEserver : String = "jdbc:oracle:thin:@rhes564:1521:mydb12"
>> _ORACLEserver: String = jdbc:oracle:thin:@rhes564:1521:mydb12
>>  
>> scala> var _username : String = "sh"
>> _username: String = sh
>>  
>> scala> var _password : String = "xxxx"
>> _password: String = sh
>>  
>> scala> val c = HiveContext.load("jdbc",
>>      | Map("url" -> _ORACLEserver,
>>      | "dbtable" -> "(SELECT to_char(CHANNEL_ID) AS CHANNEL_ID, CHANNEL_DESC FROM sh.channels)",
>>      | "user" -> _username,
>>      | "password" -> _password))
>> warning: there were 1 deprecation warning(s); re-run with -deprecation for details
>> c: org.apache.spark.sql.DataFrame = [CHANNEL_ID: string, CHANNEL_DESC: string]
>>  
>> scala> c.registerTempTable("t_c")
>>  
>> scala> c.count
>> res2: Long = 5
>>  
>> scala> HiveContext.sql("select * from t_c").collect.foreach(println)
>> [3,Direct Sales]
>> [9,Tele Sales]
>> [5,Catalog]
>> [4,Internet]
>> [2,Partners]
>>  
>> HTH
>> 
>> 
>> Dr Mich Talebzadeh
>>  
>> LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>
>>  
>> http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/>
>> 
>> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
>>  
>> 
>> On 11 July 2016 at 08:25, Mark Vervuurt <m.a.vervuurt@gmail.com <ma...@gmail.com>> wrote:
>> Hi Mich,
>> 
>> sorry for bothering did you manage to solve your problem? We have a similar problem with Spark 1.5.2 using a JDBC connection with a DataFrame to an Oracle Database.
>> 
>> Thanks,
>> Mark
>> 
>>> On 12 Feb 2016, at 11:45, Mich Talebzadeh <mich@peridale.co.uk <ma...@peridale.co.uk>> wrote:
>>> 
>>> Hi,
>>>  
>>> I use the following to connect to Oracle DB from Spark shell 1.5.2
>>>  
>>> spark-shell --master spark://50.140.197.217:7077 <> --driver-class-path /home/hduser/jars/ojdbc6.jar
>>>  
>>> in Scala I do
>>>  
>>> scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>>> sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@f9d4387
>>>  
>>> scala> val channels = sqlContext.read.format("jdbc").options(
>>>      |      Map("url" -> "jdbc:oracle:thin:@rhes564:1521:mydb",
>>>      |      "dbtable" -> "(select * from sh.channels where channel_id = 14)",
>>>      |      "user" -> "sh",
>>>      |       "password" -> "xxxxxxx")).load
>>> channels: org.apache.spark.sql.DataFrame = [CHANNEL_ID: decimal(0,-127), CHANNEL_DESC: string, CHANNEL_CLASS: string, CHANNEL_CLASS_ID: decimal(0,-127), CHANNEL_TOTAL: string, CHANNEL_TOTAL_ID: decimal(0,-127)]
>>>  
>>> scala> channels.count()
>>>  
>>> But the latter command keeps hanging?
>>>  
>>> Any ideas appreciated
>>>  
>>> Thanks,
>>>  
>>> Mich Talebzadeh
>>>  
>>> LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>
>>>  
>>> http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/>
>>>  
>>> NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Technology Ltd, its subsidiaries nor their employees accept any responsibility.
>> 
>> Met vriendelijke groet | Best regards,
>> _______________________________________________________________________
>> 
>> Ir. Mark Vervuurt
>> Senior Big Data Scientist | Insights & Data
>> 
>> Capgemini Nederland | Utrecht
>> Tel.: +31 30 6890978 – Mob.: +31653670390
>> www.capgemini.com <http://www.capgemini.com/>
>> People matter, results count.
>> __________________________________________________________________
>> 
>> 
>> 
> 


Re: Connection via JDBC to Oracle hangs after count call

Posted by Mark Vervuurt <m....@gmail.com>.
Thanks Mich,

we have got it working using the example below ;)

Mark

> On 11 Jul 2016, at 09:45, Mich Talebzadeh <mi...@gmail.com> wrote:
> 
> Hi Mark,
> 
> Hm. It should work. This is Spark 1.6.1 on Oracle 12c
>  
>  
> scala> val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
> HiveContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@70f446c
>  
> scala> var _ORACLEserver : String = "jdbc:oracle:thin:@rhes564:1521:mydb12"
> _ORACLEserver: String = jdbc:oracle:thin:@rhes564:1521:mydb12
>  
> scala> var _username : String = "sh"
> _username: String = sh
>  
> scala> var _password : String = "xxxx"
> _password: String = sh
>  
> scala> val c = HiveContext.load("jdbc",
>      | Map("url" -> _ORACLEserver,
>      | "dbtable" -> "(SELECT to_char(CHANNEL_ID) AS CHANNEL_ID, CHANNEL_DESC FROM sh.channels)",
>      | "user" -> _username,
>      | "password" -> _password))
> warning: there were 1 deprecation warning(s); re-run with -deprecation for details
> c: org.apache.spark.sql.DataFrame = [CHANNEL_ID: string, CHANNEL_DESC: string]
>  
> scala> c.registerTempTable("t_c")
>  
> scala> c.count
> res2: Long = 5
>  
> scala> HiveContext.sql("select * from t_c").collect.foreach(println)
> [3,Direct Sales]
> [9,Tele Sales]
> [5,Catalog]
> [4,Internet]
> [2,Partners]
>  
> HTH
> 
> 
> Dr Mich Talebzadeh
>  
> LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>
>  
> http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/>
> 
> Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.
>  
> 
> On 11 July 2016 at 08:25, Mark Vervuurt <m.a.vervuurt@gmail.com <ma...@gmail.com>> wrote:
> Hi Mich,
> 
> sorry for bothering did you manage to solve your problem? We have a similar problem with Spark 1.5.2 using a JDBC connection with a DataFrame to an Oracle Database.
> 
> Thanks,
> Mark
> 
>> On 12 Feb 2016, at 11:45, Mich Talebzadeh <mich@peridale.co.uk <ma...@peridale.co.uk>> wrote:
>> 
>> Hi,
>>  
>> I use the following to connect to Oracle DB from Spark shell 1.5.2
>>  
>> spark-shell --master spark://50.140.197.217:7077 <> --driver-class-path /home/hduser/jars/ojdbc6.jar
>>  
>> in Scala I do
>>  
>> scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
>> sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@f9d4387
>>  
>> scala> val channels = sqlContext.read.format("jdbc").options(
>>      |      Map("url" -> "jdbc:oracle:thin:@rhes564:1521:mydb",
>>      |      "dbtable" -> "(select * from sh.channels where channel_id = 14)",
>>      |      "user" -> "sh",
>>      |       "password" -> "xxxxxxx")).load
>> channels: org.apache.spark.sql.DataFrame = [CHANNEL_ID: decimal(0,-127), CHANNEL_DESC: string, CHANNEL_CLASS: string, CHANNEL_CLASS_ID: decimal(0,-127), CHANNEL_TOTAL: string, CHANNEL_TOTAL_ID: decimal(0,-127)]
>>  
>> scala> channels.count()
>>  
>> But the latter command keeps hanging?
>>  
>> Any ideas appreciated
>>  
>> Thanks,
>>  
>> Mich Talebzadeh
>>  
>> LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>
>>  
>> http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/>
>>  
>> NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Technology Ltd, its subsidiaries nor their employees accept any responsibility.
> 
> Met vriendelijke groet | Best regards,
> _______________________________________________________________________
> 
> Ir. Mark Vervuurt
> Senior Big Data Scientist | Insights & Data
> 
> Capgemini Nederland | Utrecht
> Tel.: +31 30 6890978 – Mob.: +31653670390
> www.capgemini.com <http://www.capgemini.com/>
> People matter, results count.
> __________________________________________________________________
> 
> 
> 


Re: Connection via JDBC to Oracle hangs after count call

Posted by Mich Talebzadeh <mi...@gmail.com>.
Hi Mark,


Hm. It should work. This is Spark 1.6.1 on Oracle 12c:

scala> val HiveContext = new org.apache.spark.sql.hive.HiveContext(sc)
HiveContext: org.apache.spark.sql.hive.HiveContext = org.apache.spark.sql.hive.HiveContext@70f446c

scala> var _ORACLEserver : String = "jdbc:oracle:thin:@rhes564:1521:mydb12"
_ORACLEserver: String = jdbc:oracle:thin:@rhes564:1521:mydb12

scala> var _username : String = "sh"
_username: String = sh

scala> var _password : String = "xxxx"
_password: String = sh

scala> val c = HiveContext.load("jdbc",
     | Map("url" -> _ORACLEserver,
     | "dbtable" -> "(SELECT to_char(CHANNEL_ID) AS CHANNEL_ID, CHANNEL_DESC FROM sh.channels)",
     | "user" -> _username,
     | "password" -> _password))
warning: there were 1 deprecation warning(s); re-run with -deprecation for details
c: org.apache.spark.sql.DataFrame = [CHANNEL_ID: string, CHANNEL_DESC: string]

scala> c.registerTempTable("t_c")

scala> c.count
res2: Long = 5

scala> HiveContext.sql("select * from t_c").collect.foreach(println)
[3,Direct Sales]
[9,Tele Sales]
[5,Catalog]
[4,Internet]
[2,Partners]


HTH
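
The deprecation warning above is because HiveContext.load was superseded by the DataFrameReader API from Spark 1.4 onwards. A sketch of the same read without the warning, reusing the variables defined in the session above:

// Same query via the non-deprecated DataFrameReader API (Spark 1.4+),
// reusing _ORACLEserver, _username and _password from the session above.
val c = HiveContext.read
  .format("jdbc")
  .options(Map(
    "url"      -> _ORACLEserver,
    "dbtable"  -> "(SELECT to_char(CHANNEL_ID) AS CHANNEL_ID, CHANNEL_DESC FROM sh.channels)",
    "user"     -> _username,
    "password" -> _password))
  .load()

The to_char(CHANNEL_ID) in the subquery is also what distinguishes this working example from the original post: it keeps the unconstrained NUMBER column from reaching Spark as decimal(0,-127).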


Dr Mich Talebzadeh



LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw



http://talebzadehmich.wordpress.com


Disclaimer: Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.



On 11 July 2016 at 08:25, Mark Vervuurt <m....@gmail.com> wrote:

> Hi Mich,
>
> sorry for bothering did you manage to solve your problem? We have a
> similar problem with Spark 1.5.2 using a JDBC connection with a DataFrame
> to an Oracle Database.
>
> Thanks,
> Mark
>
> On 12 Feb 2016, at 11:45, Mich Talebzadeh <mi...@peridale.co.uk> wrote:
>
> Hi,
>
> I use the following to connect to Oracle DB from Spark shell 1.5.2
>
> spark-shell --master spark://50.140.197.217:7077 --driver-class-path
> /home/hduser/jars/ojdbc6.jar
>
> in Scala I do
>
> scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> sqlContext: org.apache.spark.sql.SQLContext =
> org.apache.spark.sql.SQLContext@f9d4387
>
> scala> val channels = sqlContext.read.format("jdbc").options(
>      |      Map("url" -> "jdbc:oracle:thin:@rhes564:1521:mydb",
>      |      "dbtable" -> "(select * from sh.channels where channel_id =
> 14)",
>      |      "user" -> "sh",
>      |       "password" -> "xxxxxxx")).load
> channels: org.apache.spark.sql.DataFrame = [CHANNEL_ID: decimal(0,-127),
> CHANNEL_DESC: string, CHANNEL_CLASS: string, CHANNEL_CLASS_ID:
> decimal(0,-127), CHANNEL_TOTAL: string, CHANNEL_TOTAL_ID: decimal(0,-127)]
>
> scala> channels.count()
>
> But the latter command keeps hanging?
>
> Any ideas appreciated
>
> Thanks,
>
> Mich Talebzadeh
>
> LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
>
> NOTE: The information in this email is proprietary and confidential. This
> message is for the designated recipient only, if you are not the intended
> recipient, you should destroy it immediately. Any information in this
> message shall not be understood as given or endorsed by Peridale Technology
> Ltd, its subsidiaries or their employees, unless expressly so stated. It is
> the responsibility of the recipient to ensure that this email is virus
> free, therefore neither Peridale Technology Ltd, its subsidiaries nor their
> employees accept any responsibility.
>
>
> Met vriendelijke groet | Best regards,
> _______________________________________________________________________
>
>
> Ir. Mark Vervuurt
> Senior Big Data Scientist | Insights & Data
>
> Capgemini Nederland | Utrecht
> Tel.: +31 30 6890978 – Mob.: +31653670390
> www.capgemini.com
>
> People matter, results count.
>
> __________________________________________________________________
>
>
>

Re: Connection via JDBC to Oracle hangs after count call

Posted by Mark Vervuurt <m....@gmail.com>.
Hi Mich,

Sorry to bother you, but did you manage to solve your problem? We have a similar problem with Spark 1.5.2, using a JDBC connection with a DataFrame to an Oracle database.

Thanks,
Mark

> On 12 Feb 2016, at 11:45, Mich Talebzadeh <mich@peridale.co.uk <ma...@peridale.co.uk>> wrote:
> 
> Hi,
>  
> I use the following to connect to Oracle DB from Spark shell 1.5.2
>  
> spark-shell --master spark://50.140.197.217:7077 <spark://50.140.197.217:7077> --driver-class-path /home/hduser/jars/ojdbc6.jar
>  
> in Scala I do
>  
> scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@f9d4387
>  
> scala> val channels = sqlContext.read.format("jdbc").options(
>      |      Map("url" -> "jdbc:oracle:thin:@rhes564:1521:mydb",
>      |      "dbtable" -> "(select * from sh.channels where channel_id = 14)",
>      |      "user" -> "sh",
>      |       "password" -> "xxxxxxx")).load
> channels: org.apache.spark.sql.DataFrame = [CHANNEL_ID: decimal(0,-127), CHANNEL_DESC: string, CHANNEL_CLASS: string, CHANNEL_CLASS_ID: decimal(0,-127), CHANNEL_TOTAL: string, CHANNEL_TOTAL_ID: decimal(0,-127)]
>  
> scala> channels.count()
>  
> But the latter command keeps hanging?
>  
> Any ideas appreciated
>  
> Thanks,
>  
> Mich Talebzadeh
>  
> LinkedIn  https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw <https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw>
>  
> http://talebzadehmich.wordpress.com <http://talebzadehmich.wordpress.com/>
>  
> NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only, if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free, therefore neither Peridale Technology Ltd, its subsidiaries nor their employees accept any responsibility.

Met vriendelijke groet | Best regards,
_______________________________________________________________________

Ir. Mark Vervuurt
Senior Big Data Scientist | Insights & Data

Capgemini Nederland | Utrecht
Tel.: +31 30 6890978 – Mob.: +31653670390
www.capgemini.com
 People matter, results count.
__________________________________________________________________