Posted to issues@nifi.apache.org by "Martin (Jira)" <ji...@apache.org> on 2020/03/12 09:48:00 UTC
[jira] [Comment Edited] (NIFI-7247) Unable to execute SQL
[ https://issues.apache.org/jira/browse/NIFI-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057763#comment-17057763 ]
Martin edited comment on NIFI-7247 at 3/12/20, 9:47 AM:
--------------------------------------------------------
Here is the sequence again:
# Everything is created; the controller service and the processors are enabled.
# The first execution fails, as does every execution after it.
# Disable the controller service and enable it again.
# ExecuteSQL runs through cleanly. The next two executions are much faster.
# After some time we are back at step 2: the processor fails again.
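For anyone who wants to script the disable/enable cycle from step 3 instead of clicking through the UI, here is a minimal sketch against the NiFi REST API. It assumes an unsecured NiFi reachable at a base URL such as http://localhost:8080/nifi-api, and it assumes that the controller service run-status endpoint (PUT /controller-services/{id}/run-status) is available in the version in use. The helper names and the service id are placeholders, not anything from this issue. Note that NiFi will normally refuse to disable a service while referencing processors are still running, so ExecuteSQL has to be stopped first.

```python
import json
import time
import urllib.request


def run_status_payload(version, state):
    """Body for the run-status endpoint: current revision plus target state."""
    if state not in ("ENABLED", "DISABLED"):
        raise ValueError("state must be ENABLED or DISABLED")
    return {"revision": {"version": version}, "state": state}


def build_request(base_url, service_id, version, state):
    """Build (but do not send) the PUT request that changes the service state."""
    url = f"{base_url}/controller-services/{service_id}/run-status"
    body = json.dumps(run_status_payload(version, state)).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def get_service(base_url, service_id):
    """Fetch the controller service entity (contains revision and component)."""
    url = f"{base_url}/controller-services/{service_id}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def set_state(base_url, service_id, state, timeout_s=60):
    """Request a state change and poll until the service reports that state."""
    version = get_service(base_url, service_id)["revision"]["version"]
    request = build_request(base_url, service_id, version, state)
    with urllib.request.urlopen(request):
        pass
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if get_service(base_url, service_id)["component"]["state"] == state:
            return
        time.sleep(1)
    raise TimeoutError(f"service {service_id} did not reach {state}")


def restart_service(base_url, service_id):
    """Step 3 of the list above: disable the service, then enable it again."""
    set_state(base_url, service_id, "DISABLED")
    set_state(base_url, service_id, "ENABLED")
```

Calling restart_service("http://localhost:8080/nifi-api", "<controller-service-id>") would perform the cycle. On a secured instance an Authorization header would have to be added to every request, which this sketch leaves out.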
was (Author: maebert):
Here again the order:
# Everything created, controller service, processors enabled.
# First execution failed, just like every other one.
# Now disable Controller Service and enable it again.
# ExecuteSQL runs through cleanly. The next and the next but one execution are much faster.
# After some time we are back to step 2, that the processor failed.
> Unable to execute SQL
> ---------------------
>
> Key: NIFI-7247
> URL: https://issues.apache.org/jira/browse/NIFI-7247
> Project: Apache NiFi
> Issue Type: Bug
> Components: Core Framework
> Affects Versions: 1.11.3
> Environment: containerized environment on EC2 (amzn2-ami-hvm-2.0.20191116.0-x86_64-gp2)
> Reporter: Martin
> Priority: Major
> Labels: databricks, delta, jdbc, sql
> Fix For: 1.11.4
>
> Attachments: Error in UI.jpg, flow_as_template.xml, nifi-app.log
>
>
> Scenario:
> We use ExecuteSQL to read delta tables (stored in S3) via JDBC connection to databricks.
>
> Temporary Fix:
> If we disable and re-enable the controller service, ExecuteSQL works without problems. Notably, the first execution after re-enabling takes quite a long time, while the next one completes within 3 seconds.
>
> Background information:
> * How to use Databricks JDBC: [https://docs.databricks.com/integrations/bi/jdbc-odbc-bi.html]
> * Controller Service DBCPConnectionPool 1.11.3
> ** URL: jdbc:spark://#{databricks.host}...{databricks.cluster.id};...;PWD=#{databricks.token}
> ** Driver Class: com.simba.spark.jdbc.Driver
> * Table
> ** one column with <20 entries
> ** Created By Spark 2.4.4
> ** Type MANAGED
> ** Provider delta
> ** Location s3
> ** Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> ** InputFormat org.apache.hadoop.mapred.SequenceFileInputFormat
> ** OutputFormat org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
> * SQL
> ** SELECT * FROM "${db.table.schema}"."${db.table.name}"
> ** output <20 entries
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)