Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2021/09/12 02:00:00 UTC

[jira] [Commented] (SPARK-36720) On overwrite mode, setting option truncate as true doesn't truncate the table

    [ https://issues.apache.org/jira/browse/SPARK-36720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413639#comment-17413639 ] 

Hyukjin Kwon commented on SPARK-36720:
--------------------------------------

The error is from:

{quote}
com.sap.db.jdbc.exceptions.JDBCDriverException: SAP DBTech JDBC: [258]: insufficient privilege: Detailed info for this error can be found with guid 'xxxx'
{quote}

Mind elaborating on why this is an issue in PySpark or Apache Spark?

> On overwrite mode, setting option truncate as true doesn't truncate the table
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-36720
>                 URL: https://issues.apache.org/jira/browse/SPARK-36720
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.1.1
>            Reporter: Balaji Balasubramaniam
>            Priority: Major
>
> I'm using PySpark from an AWS Glue job to write data to SAP HANA over JDBC. Our requirement is to truncate and then load the data in HANA.
> I've tried both of the options below, and in both cases, based on the stack trace, Spark tries to drop the table, which is not allowed by our security design.
> # First attempt: append mode with the truncate option
> df_lake.write.format("jdbc") \
>     .option("url", edw_jdbc_url) \
>     .option("driver", "com.sap.db.jdbc.Driver") \
>     .option("dbtable", edw_jdbc_db_table) \
>     .option("user", edw_jdbc_userid) \
>     .option("password", edw_jdbc_password) \
>     .option("truncate", "true") \
>     .mode("append").save()
> # Second attempt: overwrite mode with truncate passed through properties
> properties = {"user": edw_jdbc_userid, "password": edw_jdbc_password, "truncate": "true"}
> df_lake.write.jdbc(url=edw_jdbc_url, table=edw_jdbc_db_table, mode='overwrite', properties=properties)
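> For reference, a minimal sketch of the combination I would expect to honor truncation: per the Spark JDBC documentation, the truncate option only takes effect together with overwrite mode, so the append-mode attempt above cannot trigger a TRUNCATE. Variable names reuse the ones above.
> {code}
> # Sketch: the truncate option is only consulted under SaveMode.Overwrite,
> # so pair it with mode("overwrite") rather than mode("append").
> (df_lake.write.format("jdbc")
>     .option("url", edw_jdbc_url)
>     .option("driver", "com.sap.db.jdbc.Driver")
>     .option("dbtable", edw_jdbc_db_table)
>     .option("user", edw_jdbc_userid)
>     .option("password", edw_jdbc_password)
>     .option("truncate", "true")
>     .mode("overwrite")
>     .save())
> {code}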
>  
> I've verified that the schemas match: I did a JDBC read and printed that schema, and also printed the schema from the source table (a sketch of the check follows the two listings below).
> Schema from HANA:
> root
>  |-- RTL_ACCT_ID: long (nullable = true)
>  |-- FINE_DINING_PROPOSED: string (nullable = true)
>  |-- FINE_WINE_PROPOSED: string (nullable = true)
>  |-- FINE_WINE_INF_PROPOSED: string (nullable = true)
>  |-- GOLD_SILVER_PROPOSED: string (nullable = true)
>  |-- PREMIUM_PROPOSED: string (nullable = true)
>  |-- GSP_PROPOSED: string (nullable = true)
>  |-- PROPOSED_CRAFT: string (nullable = true)
>  |-- FW_REASON: string (nullable = true)
>  |-- FWI_REASON: string (nullable = true)
>  |-- GS_REASON: string (nullable = true)
>  |-- PREM_REASON: string (nullable = true)
>  |-- FD_REASON: string (nullable = true)
>  |-- CRAFT_REASON: string (nullable = true)
>  |-- GSP_FLAG: string (nullable = true)
>  |-- GSP_REASON: string (nullable = true)
>  |-- ELIGIBILITY: string (nullable = true)
>  |-- DW_LD_S: timestamp (nullable = true)
> Schema from the source table: 
> root
>  |-- RTL_ACCT_ID: long (nullable = true)
>  |-- FINE_DINING_PROPOSED: string (nullable = true)
>  |-- FINE_WINE_PROPOSED: string (nullable = true)
>  |-- FINE_WINE_INF_PROPOSED: string (nullable = true)
>  |-- GOLD_SILVER_PROPOSED: string (nullable = true)
>  |-- PREMIUM_PROPOSED: string (nullable = true)
>  |-- GSP_PROPOSED: string (nullable = true)
>  |-- PROPOSED_CRAFT: string (nullable = true)
>  |-- FW_REASON: string (nullable = true)
>  |-- FWI_REASON: string (nullable = true)
>  |-- GS_REASON: string (nullable = true)
>  |-- PREM_REASON: string (nullable = true)
>  |-- FD_REASON: string (nullable = true)
>  |-- CRAFT_REASON: string (nullable = true)
>  |-- GSP_FLAG: string (nullable = true)
>  |-- GSP_REASON: string (nullable = true)
>  |-- ELIGIBILITY: string (nullable = true)
>  |-- DW_LD_S: timestamp (nullable = true)
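> A hypothetical sketch of how the two schemas were compared (assumes the spark session and the properties dict defined above):
> {code}
> # Read the HANA table back over JDBC and print both schemas.
> hana_df = spark.read.jdbc(url=edw_jdbc_url, table=edw_jdbc_db_table, properties=properties)
> hana_df.printSchema()    # schema from HANA
> df_lake.printSchema()    # schema from the source table
> {code}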
> This is the stack trace:
> py4j.protocol.Py4JJavaError: An error occurred while calling o169.jdbc.
> : com.sap.db.jdbc.exceptions.JDBCDriverException: SAP DBTech JDBC: [258]: insufficient privilege: Detailed info for this error can be found with guid 'xxxx'
> 	at com.sap.db.jdbc.exceptions.SQLExceptionSapDB._newInstance(SQLExceptionSapDB.java:191)
> 	at com.sap.db.jdbc.exceptions.SQLExceptionSapDB.newInstance(SQLExceptionSapDB.java:42)
> 	at com.sap.db.jdbc.packet.HReplyPacket._buildExceptionChain(HReplyPacket.java:976)
> 	at com.sap.db.jdbc.packet.HReplyPacket.getSQLExceptionChain(HReplyPacket.java:157)
> 	at com.sap.db.jdbc.packet.HPartInfo.getSQLExceptionChain(HPartInfo.java:39)
> 	at com.sap.db.jdbc.ConnectionSapDB._receive(ConnectionSapDB.java:3476)
> 	at com.sap.db.jdbc.ConnectionSapDB.exchange(ConnectionSapDB.java:1568)
> 	at com.sap.db.jdbc.StatementSapDB._executeDirect(StatementSapDB.java:1435)
> 	at com.sap.db.jdbc.StatementSapDB._execute(StatementSapDB.java:1414)
> 	at com.sap.db.jdbc.StatementSapDB._execute(StatementSapDB.java:1399)
> 	at com.sap.db.jdbc.StatementSapDB._executeUpdate(StatementSapDB.java:1387)
> 	at com.sap.db.jdbc.StatementSapDB.executeUpdate(StatementSapDB.java:175)
> 	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.executeStatement(JdbcUtils.scala:993)
> 	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.dropTable(JdbcUtils.scala:93)
> 	at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:61)
> 	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
> 	at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:90)
> 	at org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:185)
> 	at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:223)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
> 	at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:220)
> 	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:181)
> 	at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:134)
> 	at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:133)
> 	at org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:989)
> 	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
> 	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
> 	at org.apache.spark.sql.execution.SQLExecution$.executeQuery$1(SQLExecution.scala:110)
> 	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:135)
> 	at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
> 	at org.apache.spark.sql.execution.SQLExecution$.withTracker(SQLExecution.scala:232)
> 	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:135)
> 	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:253)
> 	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:134)
> 	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
> 	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:68)
> 	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:989)
> 	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:438)
> 	at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:415)
> 	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:301)
> 	at org.apache.spark.sql.DataFrameWriter.jdbc(DataFrameWriter.scala:817)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
> 	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
> 	at py4j.Gateway.invoke(Gateway.java:282)
> 	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
> 	at py4j.commands.CallCommand.execute(CallCommand.java:79)
> 	at py4j.GatewayConnection.run(GatewayConnection.java:238)
> 	at java.lang.Thread.run(Thread.java:748)
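> Reading the trace, dropTable is reached from JdbcRelationProvider.createRelation (line 61), i.e. Spark took the drop-and-recreate path instead of issuing TRUNCATE. If I read JdbcRelationProvider correctly, TRUNCATE is only used when the matched JdbcDialect reports isCascadingTruncateTable as false; there is no built-in dialect for SAP HANA, so the default (unknown) answer sends overwrite down the DROP TABLE path. A hedged workaround sketch, assuming direct driver-side JDBC access through the internal py4j gateway (spark.sparkContext._jvm), so truncation happens outside Spark and the write can stay in append mode:
> {code}
> # Workaround sketch (internal-API assumption: spark.sparkContext._jvm exposes
> # the driver JVM via py4j). Truncate manually, then append, so Spark never
> # reaches its drop-and-recreate path.
> jvm = spark.sparkContext._jvm
> conn = jvm.java.sql.DriverManager.getConnection(
>     edw_jdbc_url, edw_jdbc_userid, edw_jdbc_password)
> try:
>     stmt = conn.createStatement()
>     stmt.executeUpdate("TRUNCATE TABLE " + edw_jdbc_db_table)
>     stmt.close()
> finally:
>     conn.close()
> df_lake.write.jdbc(url=edw_jdbc_url, table=edw_jdbc_db_table, mode='append', properties=properties)
> {code}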


