Posted to issues@hive.apache.org by "Chaoyu Tang (JIRA)" <ji...@apache.org> on 2016/12/17 00:03:58 UTC

[jira] [Comment Edited] (HIVE-15441) Provide a config to timeout long compiling queries

    [ https://issues.apache.org/jira/browse/HIVE-15441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15755837#comment-15755837 ] 

Chaoyu Tang edited comment on HIVE-15441 at 12/17/16 12:02 AM:
---------------------------------------------------------------

[~thejas] No. HIVE-12431 only helps before the query acquires the compilation lock. What I meant is that once the compilation lock is acquired and the query starts compiling, the compilation is hard to interrupt. The CompilationTimeoutThread provided in this patch calls threadToInterrupt.interrupt() at timeout, but Thread.interrupt() only sets an interrupt flag; it is up to the target thread doing the compilation to check this flag and stop processing (see https://docs.oracle.com/javase/tutorial/essential/concurrency/interrupt.html). I don't think the compilation-related code calls anything like sleep/wait that would observe the interrupt flag and throw InterruptedException to stop the ongoing compilation.
HIVE-4924 added the JDBC QueryTimeout, which counts the time for the whole query processing (compilation + execution). So HIVE-4924 should already cover the compilation timeout here: if a long-compiling query has not finished by its query timeout, it should be stopped through HIVE-4924.
But HIVE-4924 does not work as expected, because the interrupt flag set via interrupt() is not checked (or handled) properly in the target thread processing the query. In HIVE-14799, I took a different approach to interrupt query processing when the query timeout is reached during execution.
Yes, we do need a way to interrupt a query while it is being compiled, but setting an interrupt flag using Thread.interrupt() may not be sufficient to stop the processing.
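
To illustrate the point about Thread.interrupt(), here is a minimal, self-contained Java sketch. It is not Hive code: the class InterruptDemo, the method stepsCompleted, and the constant TOTAL_STEPS are hypothetical names invented for this example, and the CPU-bound loop is only a stand-in for a compilation step that never blocks in sleep/wait. The worker thread is interrupted before its loop starts, so the outcome does not depend on timing.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

public class InterruptDemo {
    // Hypothetical amount of "compilation" work, chosen only for the demo.
    static final long TOTAL_STEPS = 1_000_000L;

    /**
     * Runs a CPU-bound loop on a worker thread that has already been
     * interrupted, and returns how many steps the loop completed.
     */
    static long stepsCompleted(boolean pollInterruptFlag) throws InterruptedException {
        CountDownLatch ready = new CountDownLatch(1);
        AtomicBoolean go = new AtomicBoolean(false);
        AtomicLong steps = new AtomicLong();

        Thread worker = new Thread(() -> {
            ready.countDown();
            // Spin (without blocking) until the interrupt has been delivered.
            while (!go.get()) { }
            for (long i = 0; i < TOTAL_STEPS; i++) {
                // Cooperative cancellation: the loop itself must check the flag.
                if (pollInterruptFlag && Thread.currentThread().isInterrupted()) {
                    return;
                }
                steps.incrementAndGet();
            }
        });

        worker.start();
        ready.await();
        worker.interrupt(); // only sets the worker's interrupt flag
        go.set(true);       // now let the worker start its loop
        worker.join();
        return steps.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("without polling: " + stepsCompleted(false));
        System.out.println("with polling:    " + stepsCompleted(true));
    }
}
```

Without polling, the loop runs all TOTAL_STEPS steps even though the flag was set before it began; with polling, it stops at the first check. A timeout thread that only calls interrupt() therefore needs the compiling code to check the flag at suitable points.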





> Provide a config to timeout long compiling queries
> --------------------------------------------------
>
>                 Key: HIVE-15441
>                 URL: https://issues.apache.org/jira/browse/HIVE-15441
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Planning
>            Reporter: Chao Sun
>            Assignee: Chao Sun
>         Attachments: HIVE-15441.1.patch
>
>
> Sometimes Hive users have long-compiling queries which may need to scan thousands of partitions or more (perhaps by accident). The compilation process may take a very long time, especially in {{getInputSummary}}, where it needs to make NN calls to get info about each input path.
> This is bad because it may block many other queries. Parallel compilation may help, but {{getInputSummary}} still has a global lock. In this case, it makes sense to give Hive admins a config that sets a timeout limit for compilation, so that these "bad" queries can be blocked.
> Note that https://issues.apache.org/jira/browse/HIVE-12431 also tries to address a similar issue. However, it cancels the queries that are waiting for the compile lock, which I think is not so useful for our case since the *query under compile is the one to be blamed.*


