You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Xiao Li (JIRA)" <ji...@apache.org> on 2017/08/11 19:06:00 UTC

[jira] [Resolved] (SPARK-21519) Add an option to the JDBC data source to initialize the environment of the remote database session

     [ https://issues.apache.org/jira/browse/SPARK-21519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li resolved SPARK-21519.
-----------------------------
       Resolution: Fixed
    Fix Version/s: 2.3.0

> Add an option to the JDBC data source to initialize the environment of the remote database session
> --------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21519
>                 URL: https://issues.apache.org/jira/browse/SPARK-21519
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.1.0, 2.1.1, 2.2.0
>            Reporter: Luca Canali
>            Assignee: Luca Canali
>            Priority: Minor
>             Fix For: 2.3.0
>
>
> This proposes an option to the JDBC datasource, tentatively called " sessionInitStatement" to implement the functionality of session initialization present for example in the Sqoop connector for Oracle (see https://sqoop.apache.org/docs/1.4.6/SqoopUserGuide.html#_oraoop_oracle_session_initialization_statements ) . After each database session is opened to the remote DB, and before starting to read data, this option executes a custom SQL statement (or a PL/SQL block in the case of Oracle).
> Example of usage, relevant to Oracle JDBC:
> {code}
> val preambleSQL="""
> begin 
>   execute immediate 'alter session set tracefile_identifier=sparkora'; 
>   execute immediate 'alter session set "_serial_direct_read"=true';
>   execute immediate 'alter session set time_zone=''+02:00''';
> end;
> """
> bin/spark-shell –jars ojdb6.jar
> val df = spark.read.format("jdbc").option("url", "jdbc:oracle:thin:@ORACLEDBSERVER:1521/service_name").option("driver", "oracle.jdbc.driver.OracleDriver").option("dbtable", "(select 1, sysdate, systimestamp, current_timestamp, localtimestamp from dual)").option("user", "MYUSER").option("password", "MYPASSWORD").option("fetchsize",1000).option("sessionInitStatement", preambleSQL).load()
> df.show(5,false)
> {code}
> Comments: This proposal has been developed and tested for connecting the Spark JDBC data source to Oracle databases, however I believe it can be useful for other target DBs too, as it is quite generic.  
> Note the proposed code allows to inject SQL into the target database. This is not a security concern as such, as it requires password authentication, however beware of the possibilities of injecting user-provided SQL (and PL/SQL) that this opens.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org