You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Chaoyu Tang <ct...@gmail.com> on 2016/10/05 16:56:18 UTC
Review Request 52559: HIVE-14799: Query operation are not thread safe
during its cancellation
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------
Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
Bugs: HIVE-14799
https://issues.apache.org/jira/browse/HIVE-14799
Repository: hive-git
Description
-------
This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
Diffs
-----
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4c3ef3e
ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
Diff: https://reviews.apache.org/r/52559/diff/
Testing
-------
Manually tests
Precommit tests
Thanks,
Chaoyu Tang
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Chaoyu Tang <ct...@gmail.com>.
> On Oct. 5, 2016, 8:07 p.m., Sergey Shelukhin wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 3144
> > <https://reviews.apache.org/r/52559/diff/1/?file=1523051#file1523051line3144>
> >
> > nit: needs a better explanation from a user's perspective... user doesn't know what is a driver and why would it be closed is not clear here
rewording the explanation and hope it will be better. feel free to suggest one if you have a good one.
> On Oct. 5, 2016, 8:07 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 346
> > <https://reviews.apache.org/r/52559/diff/1/?file=1523052#file1523052line346>
> >
> > is this called from elsewhere too? the method existed before
It is just a default implementation for the interface CommandProcessor, and not used at all before. To be less confusing, I won't change this init() and the DriverState is default initiated as INITIALIZED.
> On Oct. 5, 2016, 8:07 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 564
> > <https://reviews.apache.org/r/52559/diff/1/?file=1523052#file1523052line564>
> >
> > iirc if you catch InterruptException, in the catch block the interrupt flag would be reset, so this will not return true
yes, the isInterrupt is implemented as
private boolean isInterrupted() {
return Thread.currentThread().isInterrupted() || interrupt;
}
and so far we do not have wait/sleep which can throw InterruptException before any added checkpoints. In addition, the volatile variable interrupt is set when close is called and unset after it exits. So the isInterrupt is basically not affected by InterruptException and there is no concern.
> On Oct. 5, 2016, 8:07 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 1913
> > <https://reviews.apache.org/r/52559/diff/1/?file=1523052#file1523052line1913>
> >
> > same thing wrt interrupt state
won't be a concern as commented above
> On Oct. 5, 2016, 8:07 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 2238
> > <https://reviews.apache.org/r/52559/diff/1/?file=1523052#file1523052line2238>
> >
> > 1) these states do not appear thread-safe themselves
> > 2) why are so many states needed, by the way?
> > e.g. what difference does it make for termination if it's compiling or running?
1) the driverState is volatile, should be thread safe
2) they corresponds to different driver states since the driver is a stateful object. I remove the "UNINITILIAZED" and initialize DriverState default as INITIALIZED.
We only have state RUNNING which represents the driver is compiling, executing, or both in one call (like run with alreadyCompiled false). No matter the other query is compiling or executing, for close there is no difference and it should wait anyway until the other process finishes or interrupts, so one state RUNNING is sufficient to cover compiling or/and executing cases.
> On Oct. 5, 2016, 8:07 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 1155
> > <https://reviews.apache.org/r/52559/diff/1/?file=1523052#file1523052line1155>
> >
> > is it possible to add a lock object for this and add javadoc to make it explicit what it covers? I doesn't look like all ctx accesses are covered.
I changed to use a reentrant lock for all the synchronized calls.
The reason I used the synchronized (this) since it is sufficient in this case and we do not need some advanced features from using the lock object, in addition, there is an existing method like releasePlan(plan) is already the synchronized method.
We do not the locks for ctx access since it has been gurantteed to be accessed by one thread since the close thread is halt in waiting when the query process is accesing this variable.
- Chaoyu
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review151537
-----------------------------------------------------------
On Oct. 5, 2016, 4:56 p.m., Chaoyu Tang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
>
> (Updated Oct. 5, 2016, 4:56 p.m.)
>
>
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
>
>
> Bugs: HIVE-14799
> https://issues.apache.org/jira/browse/HIVE-14799
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
>
>
> Diffs
> -----
>
> common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4c3ef3e
> ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
>
> Diff: https://reviews.apache.org/r/52559/diff/
>
>
> Testing
> -------
>
> Manually tests
> Precommit tests
>
>
> Thanks,
>
> Chaoyu Tang
>
>
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review151537
-----------------------------------------------------------
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java (line 3144)
<https://reviews.apache.org/r/52559/#comment219978>
nit: needs a better explanation from a user's perspective... user doesn't know what is a driver and why would it be closed is not clear here
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 344)
<https://reviews.apache.org/r/52559/#comment219979>
is this called from elsewhere too? the method existed before
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 556)
<https://reviews.apache.org/r/52559/#comment219980>
iirc if you catch InterruptException, in the catch block the interrupt flag would be reset, so this will not return true
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 1145)
<https://reviews.apache.org/r/52559/#comment219981>
is it possible to add a lock object for this and add javadoc to make it explicit what it covers? I doesn't look like all ctx accesses are covered.
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 1392)
<https://reviews.apache.org/r/52559/#comment219982>
nit: whitespace here and below
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 1902)
<https://reviews.apache.org/r/52559/#comment219983>
same thing wrt interrupt state
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 2221)
<https://reviews.apache.org/r/52559/#comment219984>
nit: might as well just pass TimeUnit.MILLISECONDS instead of * 1000
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 2227)
<https://reviews.apache.org/r/52559/#comment219990>
1) these states do not appear thread-safe themselves
2) why are so many states needed, by the way?
e.g. what difference does it make for termination if it's compiling or running?
- Sergey Shelukhin
On Oct. 5, 2016, 4:56 p.m., Chaoyu Tang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
>
> (Updated Oct. 5, 2016, 4:56 p.m.)
>
>
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
>
>
> Bugs: HIVE-14799
> https://issues.apache.org/jira/browse/HIVE-14799
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
>
>
> Diffs
> -----
>
> common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4c3ef3e
> ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
>
> Diff: https://reviews.apache.org/r/52559/diff/
>
>
> Testing
> -------
>
> Manually tests
> Precommit tests
>
>
> Thanks,
>
> Chaoyu Tang
>
>
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Chaoyu Tang <ct...@gmail.com>.
> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 617
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line617>
> >
> > this will overwrite the previous error (this being finally) without logging it (other than to console). In fact the original error if any is probably more useful to the user...
Yeah, good catch, I converted them to LOG.warn and write them to the log.
> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 1424
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line1424>
> >
> > nit: typo
> >
> > also is it really true for every other possible state?
change the error msg "FAILED: Precompiled query has been cancelled or closed". Yes, the query should has been successfully compiled when user call this API since passes the flag alreadyCompiled value true
> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 635
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line635>
> >
> > this ignores downstreamError. By design? Same for similar code below. The existing code with this pattern returns immediately in the catch that sets downstreamError, but here the execution can continue from the above catch
The existing code also has the finally block to invoke hooks, do perfLog etc after the catch which sets the downstreamError, this patch is only to add deferred close if applicable and do the state transition atomically.
> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 1582
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line1582>
> >
> > some interrupt-related return paths in the try above will result in isFinishedWithError being true here... it seems like that's by design right now because error and interrupt conditions are not distinguished here. It might be helpful to adcd a comment
added the comment to variable isFinishedWithError
> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 2015
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line2015>
> >
> > same as above - should we return here? seems like the below ignores downstreamError on some paths
see comments above
> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 2271
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line2271>
> >
> > does it have to throw here or can it just exit? the old code seems to only do stuff when in the right state, and nothing in the wrong state
I think it might be more helpful to raise the exception here if this API is called in a wrong state. For example, the existing getResult(..) checkes the state:
if (destroyed) {
throw new IOException("FAILED: Operation cancelled");
}
> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 2418
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line2418>
> >
> > I don't see an explicit check for this state anywhere in the code, except for some error paths. What is setting driver state to destroyed supposed to trigger?
In getResult(..) or resetFetch(), this state is explicitly checked. If the destroy is called, the driver will do the final cleaning up and after that, driver could no more be useful.
- Chaoyu
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review152243
-----------------------------------------------------------
On Oct. 11, 2016, 10:12 p.m., Chaoyu Tang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
>
> (Updated Oct. 11, 2016, 10:12 p.m.)
>
>
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
>
>
> Bugs: HIVE-14799
> https://issues.apache.org/jira/browse/HIVE-14799
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
>
> Diff: https://reviews.apache.org/r/52559/diff/
>
>
> Testing
> -------
>
> Manually tests
> Precommit tests
>
>
> Thanks,
>
> Chaoyu Tang
>
>
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review152243
-----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 193)
<https://reviews.apache.org/r/52559/#comment221144>
comments for closed, destroyed, error and interrupt states would be useful
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 379)
<https://reviews.apache.org/r/52559/#comment221130>
nit: here and elsewhere, lock should be outside of the try, since we don't want to call unlock if the lock call itself fails
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 611)
<https://reviews.apache.org/r/52559/#comment221138>
this will overwrite the previous error (this being finally) without logging it (other than to console). In fact the original error if any is probably more useful to the user...
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 629)
<https://reviews.apache.org/r/52559/#comment221139>
this ignores downstreamError. By design? Same for similar code below. The existing code with this pattern returns immediately in the catch that sets downstreamError, but here the execution can continue from the above catch
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 1399)
<https://reviews.apache.org/r/52559/#comment221133>
nit: typo
also is it really true for every other possible state?
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 1557)
<https://reviews.apache.org/r/52559/#comment221135>
some interrupt-related return paths in the try above will result in isFinishedWithError being true here... it seems like that's by design right now because error and interrupt conditions are not distinguished here. It might be helpful to adcd a comment
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 1989)
<https://reviews.apache.org/r/52559/#comment221141>
same as above - should we return here? seems like the below ignores downstreamError on some paths
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 2245)
<https://reviews.apache.org/r/52559/#comment221142>
does it have to throw here or can it just exit? the old code seems to only do stuff when in the right state, and nothing in the wrong state
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 2390)
<https://reviews.apache.org/r/52559/#comment221143>
I don't see an explicit check for this state anywhere in the code, except for some error paths. What is setting driver state to destroyed supposed to trigger?
- Sergey Shelukhin
On Oct. 11, 2016, 10:12 p.m., Chaoyu Tang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
>
> (Updated Oct. 11, 2016, 10:12 p.m.)
>
>
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
>
>
> Bugs: HIVE-14799
> https://issues.apache.org/jira/browse/HIVE-14799
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
>
> Diff: https://reviews.apache.org/r/52559/diff/
>
>
> Testing
> -------
>
> Manually tests
> Precommit tests
>
>
> Thanks,
>
> Chaoyu Tang
>
>
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Yongzhi Chen <yc...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review152659
-----------------------------------------------------------
The 8th version looks good to me. +1
- Yongzhi Chen
On Oct. 13, 2016, 11:38 p.m., Chaoyu Tang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
>
> (Updated Oct. 13, 2016, 11:38 p.m.)
>
>
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
>
>
> Bugs: HIVE-14799
> https://issues.apache.org/jira/browse/HIVE-14799
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
>
> Diff: https://reviews.apache.org/r/52559/diff/
>
>
> Testing
> -------
>
> Manually tests
> Precommit tests
>
>
> Thanks,
>
> Chaoyu Tang
>
>
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------
(Updated Oct. 13, 2016, 11:38 p.m.)
Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
Changes
-------
revised an issue yongzhi raised
Bugs: HIVE-14799
https://issues.apache.org/jira/browse/HIVE-14799
Repository: hive-git
Description
-------
This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
Diff: https://reviews.apache.org/r/52559/diff/
Testing
-------
Manually tests
Precommit tests
Thanks,
Chaoyu Tang
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Yongzhi Chen <yc...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review152556
-----------------------------------------------------------
It is not thread safe for releaseDriverContext can be called in compling mode from cancel, but seems only null value matters. So use the local variable(driverCxt) to avoid NPE after the close() is called from cancel?
- Yongzhi Chen
On Oct. 12, 2016, 4:31 a.m., Chaoyu Tang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
>
> (Updated Oct. 12, 2016, 4:31 a.m.)
>
>
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
>
>
> Bugs: HIVE-14799
> https://issues.apache.org/jira/browse/HIVE-14799
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
>
> Diff: https://reviews.apache.org/r/52559/diff/
>
>
> Testing
> -------
>
> Manually tests
> Precommit tests
>
>
> Thanks,
>
> Chaoyu Tang
>
>
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------
(Updated Oct. 12, 2016, 4:31 a.m.)
Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
Changes
-------
Revised based on the review feedback.
Bugs: HIVE-14799
https://issues.apache.org/jira/browse/HIVE-14799
Repository: hive-git
Description
-------
This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
Diff: https://reviews.apache.org/r/52559/diff/
Testing
-------
Manually tests
Precommit tests
Thanks,
Chaoyu Tang
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Chaoyu Tang <ct...@gmail.com>.
> On Oct. 12, 2016, 1:27 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 2385
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line2385>
> >
> > javadocs to distinguish close from destroy would be helpful
I added some comments about the close/destroy. It looks to me that they could be combined and I did not know the history and understand why there are two APIs. This patch just did some refactor to these methods and did not change their logic. Maybe it deserves a follow-up JIRA.
- Chaoyu
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review152246
-----------------------------------------------------------
On Oct. 11, 2016, 10:12 p.m., Chaoyu Tang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
>
> (Updated Oct. 11, 2016, 10:12 p.m.)
>
>
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
>
>
> Bugs: HIVE-14799
> https://issues.apache.org/jira/browse/HIVE-14799
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
>
> Diff: https://reviews.apache.org/r/52559/diff/
>
>
> Testing
> -------
>
> Manually tests
> Precommit tests
>
>
> Thanks,
>
> Chaoyu Tang
>
>
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review152246
-----------------------------------------------------------
ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 2357)
<https://reviews.apache.org/r/52559/#comment221145>
javadocs to distinguish close from destroy would be helpful
- Sergey Shelukhin
On Oct. 11, 2016, 10:12 p.m., Chaoyu Tang wrote:
>
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
>
> (Updated Oct. 11, 2016, 10:12 p.m.)
>
>
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
>
>
> Bugs: HIVE-14799
> https://issues.apache.org/jira/browse/HIVE-14799
>
>
> Repository: hive-git
>
>
> Description
> -------
>
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
>
>
> Diffs
> -----
>
> ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
>
> Diff: https://reviews.apache.org/r/52559/diff/
>
>
> Testing
> -------
>
> Manually tests
> Precommit tests
>
>
> Thanks,
>
> Chaoyu Tang
>
>
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------
(Updated Oct. 11, 2016, 10:12 p.m.)
Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
Changes
-------
revised the patch to fix the test failures
Bugs: HIVE-14799
https://issues.apache.org/jira/browse/HIVE-14799
Repository: hive-git
Description
-------
This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
Diff: https://reviews.apache.org/r/52559/diff/
Testing
-------
Manually tests
Precommit tests
Thanks,
Chaoyu Tang
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------
(Updated Oct. 10, 2016, 12:42 a.m.)
Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
Bugs: HIVE-14799
https://issues.apache.org/jira/browse/HIVE-14799
Repository: hive-git
Description
-------
This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
Diff: https://reviews.apache.org/r/52559/diff/
Testing
-------
Manually tests
Precommit tests
Thanks,
Chaoyu Tang
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------
(Updated Oct. 10, 2016, 12:16 a.m.)
Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
Changes
-------
optimized the locking
Bugs: HIVE-14799
https://issues.apache.org/jira/browse/HIVE-14799
Repository: hive-git
Description
-------
This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
Diff: https://reviews.apache.org/r/52559/diff/
Testing
-------
Manually tests
Precommit tests
Thanks,
Chaoyu Tang
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------
(Updated Oct. 9, 2016, 9:05 p.m.)
Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
Changes
-------
upload a new patch using n owaiting model.
Bugs: HIVE-14799
https://issues.apache.org/jira/browse/HIVE-14799
Repository: hive-git
Description
-------
This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
Diffs (updated)
-----
ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
Diff: https://reviews.apache.org/r/52559/diff/
Testing
-------
Manually tests
Precommit tests
Thanks,
Chaoyu Tang
Re: Review Request 52559: HIVE-14799: Query operation are not thread
safe during its cancellation
Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------
(Updated Oct. 6, 2016, 5:10 a.m.)
Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
Changes
-------
Update the patch based on Sergio's comments
Bugs: HIVE-14799
https://issues.apache.org/jira/browse/HIVE-14799
Repository: hive-git
Description
-------
This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
Diffs (updated)
-----
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4c3ef3e
ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434
Diff: https://reviews.apache.org/r/52559/diff/
Testing
-------
Manually tests
Precommit tests
Thanks,
Chaoyu Tang