Posted to issues@nifi.apache.org by "Martin (Jira)" <ji...@apache.org> on 2020/03/12 09:48:00 UTC
[jira] [Comment Edited] (NIFI-7247) Unable to execute SQL

    [ https://issues.apache.org/jira/browse/NIFI-7247?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057763#comment-17057763 ] 

Martin edited comment on NIFI-7247 at 3/12/20, 9:47 AM:
--------------------------------------------------------

Here is the sequence again:
 # Everything is created; the controller service and the processors are enabled.
 # The first execution fails, as does every subsequent one.
 # We disable the Controller Service and enable it again.
 # ExecuteSQL runs through cleanly. The next two executions are much faster.
 # After some time we are back at step 2: the processor fails again.
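
For anyone who wants to script step 3 instead of clicking through the UI, here is a minimal sketch in Python. It assumes an unsecured NiFi instance at a hypothetical base URL and assumes the REST endpoint PUT /controller-services/{id}/run-status is available in your version; the helper names (run_status_payload, set_state, bounce) are made up for illustration and are not part of NiFi. It is not a fix for the underlying problem, only an automation of the workaround.

```python
import json
import urllib.request

# Assumption: base URL of an unsecured NiFi instance; adjust host, port and auth.
NIFI_API = "http://localhost:8080/nifi-api"


def run_status_payload(version, state):
    """Build the request body for the run-status endpoint."""
    if state not in ("ENABLED", "DISABLED"):
        raise ValueError("state must be ENABLED or DISABLED")
    return {"revision": {"version": version}, "state": state}


def http_json(method, url, body=None):
    """Send a JSON request with urllib and return the decoded JSON response."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def set_state(service_id, state, send=http_json):
    """Fetch the current revision of the service, then request the new state."""
    url = f"{NIFI_API}/controller-services/{service_id}"
    entity = send("GET", url)
    payload = run_status_payload(entity["revision"]["version"], state)
    return send("PUT", url + "/run-status", payload)


def bounce(service_id, send=http_json):
    """Disable and then re-enable the controller service (step 3)."""
    set_state(service_id, "DISABLED", send)
    # State changes are asynchronous, and referencing processors may need to be
    # stopped first; a real script would poll the service state before enabling.
    return set_state(service_id, "ENABLED", send)
```

Calling bounce("<controller-service-id>") with the id of the DBCPConnectionPool would then issue the disable and enable requests one after the other. The send parameter exists only so the HTTP layer can be swapped out, for example for a dry run.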


was (Author: maebert):
Here again the order:
 # Everything created, controller service, processors enabled.
 # First execution failed, just like every other one.
 # Now disable Controller Service and enable it again.
 # ExecuteSQL runs through cleanly. The next and the next but one execution are much faster.
 # After some time we are back to step 2, that the processor failed.

> Unable to execute SQL
> ---------------------
>
>                 Key: NIFI-7247
>                 URL: https://issues.apache.org/jira/browse/NIFI-7247
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.11.3
>         Environment: containerized environment on EC2 (amzn2-ami-hvm-2.0.20191116.0-x86_64-gp2)
>            Reporter: Martin
>            Priority: Major
>              Labels: databricks, delta, jdbc, sql
>             Fix For: 1.11.4
>
>         Attachments: Error in UI.jpg, flow_as_template.xml, nifi-app.log
>
>
> Scenario:
> We use ExecuteSQL to read delta tables (stored in S3) via JDBC connection to databricks.
>  
> Temporary Fix:
> If we disable and re-enable the controller service, ExecuteSQL works without problems. What is noticeable, however, is that the first execution afterwards takes quite a long time, while the next one completes within 3 seconds.
>  
> Background information:
>  * How to use Databricks JDBC [https://docs.databricks.com/integrations/bi/jdbc-odbc-bi.html]
>  * Controller Service DBCPConnectionPool 1.11.3
>  ** URL: jdbc:spark://#{databricks.host}...{databricks.cluster.id};...;PWD=#{databricks.token}
>  ** Driver Class: com.simba.spark.jdbc.Driver
>  * Table
>  ** one column with <20 entries
>  ** Created By Spark 2.4.4
>  ** Type MANAGED
>  ** Provider delta
>  ** Location s3
>  ** Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
>  ** InputFormat org.apache.hadoop.mapred.SequenceFileInputFormat
>  ** OutputFormat org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
>  * SQL 
>  ** SELECT * FROM "${db.table.schema}"."${db.table.name}"
>  ** output <20 entries
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)