Posted to issues@nifi.apache.org by "Ed Berezitsky (JIRA)" <ji...@apache.org> on 2018/05/02 16:11:00 UTC

[jira] [Commented] (NIFI-5044) SelectHiveQL accept only one statement

    [ https://issues.apache.org/jira/browse/NIFI-5044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461241#comment-16461241 ] 

Ed Berezitsky commented on NIFI-5044:
-------------------------------------

[~mattyb149],

I think we should do that regardless of whether the attributes contain EL.

But here are more scenarios. This processor has "INPUT ALLOWED", so it can start the flow, and then there won't be any incoming flow files. In that case there is nothing we can do except post an error to the bulletin.

Now, the processor creates one OR MORE flow files (depending on the number of records and the "Max Rows Per Flow File" param). Flow files are cached until all data is collected; only then are all the new flow files transferred to the success relationship. If we fail on the post queries, we either need to forward all of them to failure, or the entire data set will be discarded (if we roll back). I can iterate over all the flow files with data and add an attribute with the error cause to each of them before sending them all to failure.
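
To make the second option concrete, here is a rough sketch of the routing I have in mind. It is plain Java and does not use the NiFi API: ResultFlowFile, Outcome, route() and the attribute name "selecthiveql.postquery.error" are all hypothetical stand-ins for illustration, not existing NiFi classes or attributes. In the processor itself the same loop would go through the process session instead.

```java
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: models the "tag every cached flow file, then fail them all" idea
// with hypothetical stand-in types instead of the real NiFi classes.
class PostQueryFailureSketch {

    /** Hypothetical stand-in for a flow file: only its attribute map matters here. */
    static final class ResultFlowFile {
        final Map<String, String> attributes = new HashMap<>();
    }

    /** Hypothetical attribute name, chosen for this example only. */
    static final String ERROR_ATTR = "selecthiveql.postquery.error";

    /** Records which relationship each cached flow file was sent to. */
    static final class Outcome {
        final List<ResultFlowFile> success = new ArrayList<>();
        final List<ResultFlowFile> failure = new ArrayList<>();
    }

    /**
     * Routes the cached result flow files once the post queries have run.
     * A null postQueryError means the post queries succeeded.
     */
    static Outcome route(List<ResultFlowFile> cached, Exception postQueryError) {
        Outcome out = new Outcome();
        if (postQueryError == null) {
            // Post queries succeeded: everything goes to success as before.
            out.success.addAll(cached);
            return out;
        }
        // Post queries failed: keep the data, but tag each flow file with the cause.
        String cause = String.valueOf(postQueryError.getMessage());
        for (ResultFlowFile flowFile : cached) {
            flowFile.attributes.put(ERROR_ATTR, cause);
            out.failure.add(flowFile);
        }
        return out;
    }

    public static void main(String[] args) {
        List<ResultFlowFile> cached = new ArrayList<>();
        cached.add(new ResultFlowFile());
        cached.add(new ResultFlowFile());

        Outcome out = route(cached, new SQLException("post query failed"));
        System.out.println(out.failure.size() + " flow files to failure, cause="
                + out.failure.get(0).attributes.get(ERROR_ATTR));
    }
}
```

The point of this variant is that the selected data is not lost: a downstream processor can still inspect the error attribute and decide whether to retry or drop the flow files.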

Can you comment please?

Thanks.

> SelectHiveQL accept only one statement
> --------------------------------------
>
>                 Key: NIFI-5044
>                 URL: https://issues.apache.org/jira/browse/NIFI-5044
>             Project: Apache NiFi
>          Issue Type: Improvement
>    Affects Versions: 1.2.0
>            Reporter: Davide Isoardi
>            Assignee: Ed Berezitsky
>            Priority: Critical
>
> This commit (https://github.com/apache/nifi/commit/bbc714e73ba245de7bc32fd9958667c847101f7d) claims to add support for running multiple statements in both SelectHiveQL and PutHiveQL; instead, it adds support only to PutHiveQL, so SelectHiveQL still lacks this important feature. @Matt Burgess, I saw that you worked on that; is there any reason for this? If not, can we support it?
> If I try to execute this query:
> {quote}set hive.vectorized.execution.enabled = false; SELECT * FROM table_name
> {quote}
> I get this error:
>  
> {quote}2018-04-05 13:35:40,572 ERROR [Timer-Driven Process Thread-146] o.a.nifi.processors.hive.SelectHiveQL SelectHiveQL[id=243d4c17-b1fe-14af-ffff-ffffee8ce15e] Unable to execute HiveQL select query set hive.vectorized.execution.enabled = false; SELECT * FROM table_name for StandardFlowFileRecord[uuid=0e035558-07ce-473b-b0d4-ac00b8b1df93,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1522824912161-2753, container=default, section=705], offset=838441, length=25],offset=0,name=cliente_attributi.csv,size=25] due to org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: The query did not generate a result set!; routing to failure: {}
>  org.apache.nifi.processor.exception.ProcessException: java.sql.SQLException: The query did not generate a result set!
>  at org.apache.nifi.processors.hive.SelectHiveQL$2.process(SelectHiveQL.java:305)
>  at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2529)
>  at org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:275)
>  at org.apache.nifi.processors.hive.SelectHiveQL.lambda$onTrigger$0(SelectHiveQL.java:215)
>  at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:114)
>  at org.apache.nifi.processor.util.pattern.PartialFunctions.onTrigger(PartialFunctions.java:106)
>  at org.apache.nifi.processors.hive.SelectHiveQL.onTrigger(SelectHiveQL.java:215)
>  at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1120)
>  at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:147)
>  at org.apache.nifi.controller.tasks.ContinuallyRunProcessorTask.call(ContinuallyRunProcessorTask.java:47)
>  at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:132)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
>  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
>  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
>  Caused by: java.sql.SQLException: The query did not generate a result set!
>  at org.apache.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:438)
>  at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
>  at org.apache.commons.dbcp.DelegatingStatement.executeQuery(DelegatingStatement.java:208)
>  at org.apache.nifi.processors.hive.SelectHiveQL$2.process(SelectHiveQL.java:293)
> {quote}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)