You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Chaoyu Tang <ct...@gmail.com> on 2016/10/05 16:56:18 UTC

Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------

Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.


Bugs: HIVE-14799
    https://issues.apache.org/jira/browse/HIVE-14799


Repository: hive-git


Description
-------

This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.


Diffs
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4c3ef3e 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 

Diff: https://reviews.apache.org/r/52559/diff/


Testing
-------

Manually tests
Precommit tests


Thanks,

Chaoyu Tang


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Chaoyu Tang <ct...@gmail.com>.

> On Oct. 5, 2016, 8:07 p.m., Sergey Shelukhin wrote:
> > common/src/java/org/apache/hadoop/hive/conf/HiveConf.java, line 3144
> > <https://reviews.apache.org/r/52559/diff/1/?file=1523051#file1523051line3144>
> >
> >     nit: needs a better explanation from a user's perspective... user doesn't know what is a driver and why would it be closed is not clear here

rewording the explanation and hope it will be better. feel free to suggest one if you have a good one.


> On Oct. 5, 2016, 8:07 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 346
> > <https://reviews.apache.org/r/52559/diff/1/?file=1523052#file1523052line346>
> >
> >     is this called from elsewhere too? the method existed before

It is just a default implementation for the interface CommandProcessor, and not used at all before. To be less confusing, I won't change this init() and the DriverState is default initiated as INITIALIZED.


> On Oct. 5, 2016, 8:07 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 564
> > <https://reviews.apache.org/r/52559/diff/1/?file=1523052#file1523052line564>
> >
> >     iirc if you catch InterruptException, in the catch block the interrupt flag would be reset, so this will not return true

yes, the isInterrupt is implemented as
  private boolean isInterrupted() {
    return Thread.currentThread().isInterrupted() || interrupt;
  }
and so far we do not have wait/sleep which can throw InterruptException before any added checkpoints. In addition, the volatile variable interrupt is set when close is called and unset after it exits. So the isInterrupt is basically not affected by InterruptException and there is no concern.


> On Oct. 5, 2016, 8:07 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 1913
> > <https://reviews.apache.org/r/52559/diff/1/?file=1523052#file1523052line1913>
> >
> >     same thing wrt interrupt state

won't be a concern as commented above


> On Oct. 5, 2016, 8:07 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 2238
> > <https://reviews.apache.org/r/52559/diff/1/?file=1523052#file1523052line2238>
> >
> >     1) these states do not appear thread-safe themselves
> >     2) why are so many states needed, by the way?
> >     e.g. what difference does it make for termination if it's compiling or running?

1) the driverState is volatile, should be thread safe
2) they corresponds to different driver states since the driver is a stateful object. I remove the "UNINITILIAZED" and initialize DriverState default as INITIALIZED.
We only have state RUNNING which represents the driver is compiling, executing, or both in one call (like run with alreadyCompiled false). No matter the other query is compiling or executing, for close there is no difference and it should wait anyway until the other process finishes or interrupts, so one state RUNNING is sufficient to cover compiling or/and executing cases.


> On Oct. 5, 2016, 8:07 p.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 1155
> > <https://reviews.apache.org/r/52559/diff/1/?file=1523052#file1523052line1155>
> >
> >     is it possible to add a lock object for this and add javadoc to make it explicit what it covers? I doesn't look like all ctx accesses are covered.

I changed to use a reentrant lock for all the synchronized calls. 
The reason I used the synchronized (this) since it is sufficient in this case and we do not need some advanced features from using the lock object, in addition, there is an existing method like releasePlan(plan) is already the synchronized method.
We do not the locks for ctx access since it has been gurantteed to be accessed by one thread since the close thread is halt in waiting when the query process is accesing this variable.


- Chaoyu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review151537
-----------------------------------------------------------


On Oct. 5, 2016, 4:56 p.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2016, 4:56 p.m.)
> 
> 
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
> 
> 
> Bugs: HIVE-14799
>     https://issues.apache.org/jira/browse/HIVE-14799
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4c3ef3e 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 
> 
> Diff: https://reviews.apache.org/r/52559/diff/
> 
> 
> Testing
> -------
> 
> Manually tests
> Precommit tests
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review151537
-----------------------------------------------------------




common/src/java/org/apache/hadoop/hive/conf/HiveConf.java (line 3144)
<https://reviews.apache.org/r/52559/#comment219978>

    nit: needs a better explanation from a user's perspective... user doesn't know what is a driver and why would it be closed is not clear here



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 344)
<https://reviews.apache.org/r/52559/#comment219979>

    is this called from elsewhere too? the method existed before



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 556)
<https://reviews.apache.org/r/52559/#comment219980>

    iirc if you catch InterruptException, in the catch block the interrupt flag would be reset, so this will not return true



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 1145)
<https://reviews.apache.org/r/52559/#comment219981>

    is it possible to add a lock object for this and add javadoc to make it explicit what it covers? I doesn't look like all ctx accesses are covered.



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 1392)
<https://reviews.apache.org/r/52559/#comment219982>

    nit: whitespace here and below



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 1902)
<https://reviews.apache.org/r/52559/#comment219983>

    same thing wrt interrupt state



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 2221)
<https://reviews.apache.org/r/52559/#comment219984>

    nit: might as well just pass TimeUnit.MILLISECONDS instead of * 1000



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 2227)
<https://reviews.apache.org/r/52559/#comment219990>

    1) these states do not appear thread-safe themselves
    2) why are so many states needed, by the way?
    e.g. what difference does it make for termination if it's compiling or running?


- Sergey Shelukhin


On Oct. 5, 2016, 4:56 p.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
> 
> (Updated Oct. 5, 2016, 4:56 p.m.)
> 
> 
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
> 
> 
> Bugs: HIVE-14799
>     https://issues.apache.org/jira/browse/HIVE-14799
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
> 
> 
> Diffs
> -----
> 
>   common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4c3ef3e 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 
> 
> Diff: https://reviews.apache.org/r/52559/diff/
> 
> 
> Testing
> -------
> 
> Manually tests
> Precommit tests
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Chaoyu Tang <ct...@gmail.com>.

> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 617
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line617>
> >
> >     this will overwrite the previous error (this being finally) without logging it (other than to console). In fact the original error if any is probably more useful to the user...

Yeah, good catch, I converted them to LOG.warn and write them to the log.


> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 1424
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line1424>
> >
> >     nit: typo
> >     
> >     also is it really true for every other possible state?

change the error msg "FAILED: Precompiled query has been cancelled or closed". Yes, the query should has been successfully compiled when user call this API since passes the flag alreadyCompiled value true


> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 635
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line635>
> >
> >     this ignores downstreamError. By design? Same for similar code below. The existing code with this pattern returns immediately in the catch that sets downstreamError, but here the execution can continue from the above catch

The existing code also has the finally block to invoke hooks, do perfLog etc after the catch which sets the downstreamError, this patch is only to add deferred close if applicable and do the state transition atomically.


> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 1582
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line1582>
> >
> >     some interrupt-related return paths in the try above will result in isFinishedWithError being true here... it seems like that's by design right now because error and interrupt conditions are not distinguished here. It might be helpful to adcd a comment

added the comment to variable isFinishedWithError


> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 2015
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line2015>
> >
> >     same as above - should we return here? seems like the below ignores downstreamError on some paths

see comments above


> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 2271
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line2271>
> >
> >     does it have to throw here or can it just exit? the old code seems to only do stuff when in the right state, and nothing in the wrong state

I think it might be more helpful to raise the exception here if this API is called in a wrong state. For example, the existing getResult(..) checkes the state:
    if (destroyed) {
      throw new IOException("FAILED: Operation cancelled");
}


> On Oct. 12, 2016, 1:25 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 2418
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line2418>
> >
> >     I don't see an explicit check for this state anywhere in the code, except for some error paths. What is setting driver state to destroyed supposed to trigger?

In getResult(..) or resetFetch(), this state is explicitly checked. If the destroy is called, the driver will do the final cleaning up and after that, driver could no more be useful.


- Chaoyu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review152243
-----------------------------------------------------------


On Oct. 11, 2016, 10:12 p.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
> 
> (Updated Oct. 11, 2016, 10:12 p.m.)
> 
> 
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
> 
> 
> Bugs: HIVE-14799
>     https://issues.apache.org/jira/browse/HIVE-14799
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 
> 
> Diff: https://reviews.apache.org/r/52559/diff/
> 
> 
> Testing
> -------
> 
> Manually tests
> Precommit tests
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review152243
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 193)
<https://reviews.apache.org/r/52559/#comment221144>

    comments for closed, destroyed, error and interrupt states would be useful



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 379)
<https://reviews.apache.org/r/52559/#comment221130>

    nit: here and elsewhere, lock should be outside of the try, since we don't want to call unlock if the lock call itself fails



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 611)
<https://reviews.apache.org/r/52559/#comment221138>

    this will overwrite the previous error (this being finally) without logging it (other than to console). In fact the original error if any is probably more useful to the user...



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 629)
<https://reviews.apache.org/r/52559/#comment221139>

    this ignores downstreamError. By design? Same for similar code below. The existing code with this pattern returns immediately in the catch that sets downstreamError, but here the execution can continue from the above catch



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 1399)
<https://reviews.apache.org/r/52559/#comment221133>

    nit: typo
    
    also is it really true for every other possible state?



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 1557)
<https://reviews.apache.org/r/52559/#comment221135>

    some interrupt-related return paths in the try above will result in isFinishedWithError being true here... it seems like that's by design right now because error and interrupt conditions are not distinguished here. It might be helpful to adcd a comment



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 1989)
<https://reviews.apache.org/r/52559/#comment221141>

    same as above - should we return here? seems like the below ignores downstreamError on some paths



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 2245)
<https://reviews.apache.org/r/52559/#comment221142>

    does it have to throw here or can it just exit? the old code seems to only do stuff when in the right state, and nothing in the wrong state



ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 2390)
<https://reviews.apache.org/r/52559/#comment221143>

    I don't see an explicit check for this state anywhere in the code, except for some error paths. What is setting driver state to destroyed supposed to trigger?


- Sergey Shelukhin


On Oct. 11, 2016, 10:12 p.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
> 
> (Updated Oct. 11, 2016, 10:12 p.m.)
> 
> 
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
> 
> 
> Bugs: HIVE-14799
>     https://issues.apache.org/jira/browse/HIVE-14799
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 
> 
> Diff: https://reviews.apache.org/r/52559/diff/
> 
> 
> Testing
> -------
> 
> Manually tests
> Precommit tests
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Yongzhi Chen <yc...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review152659
-----------------------------------------------------------



The 8th version looks good to me. +1

- Yongzhi Chen


On Oct. 13, 2016, 11:38 p.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
> 
> (Updated Oct. 13, 2016, 11:38 p.m.)
> 
> 
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
> 
> 
> Bugs: HIVE-14799
>     https://issues.apache.org/jira/browse/HIVE-14799
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 
> 
> Diff: https://reviews.apache.org/r/52559/diff/
> 
> 
> Testing
> -------
> 
> Manually tests
> Precommit tests
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------

(Updated Oct. 13, 2016, 11:38 p.m.)


Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.


Changes
-------

revised an issue yongzhi raised


Bugs: HIVE-14799
    https://issues.apache.org/jira/browse/HIVE-14799


Repository: hive-git


Description
-------

This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 

Diff: https://reviews.apache.org/r/52559/diff/


Testing
-------

Manually tests
Precommit tests


Thanks,

Chaoyu Tang


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Yongzhi Chen <yc...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review152556
-----------------------------------------------------------



It is not thread safe for releaseDriverContext can be called in compling mode from cancel, but seems only null value matters. So use the local variable(driverCxt) to avoid NPE after the close() is called from cancel?

- Yongzhi Chen


On Oct. 12, 2016, 4:31 a.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
> 
> (Updated Oct. 12, 2016, 4:31 a.m.)
> 
> 
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
> 
> 
> Bugs: HIVE-14799
>     https://issues.apache.org/jira/browse/HIVE-14799
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 
> 
> Diff: https://reviews.apache.org/r/52559/diff/
> 
> 
> Testing
> -------
> 
> Manually tests
> Precommit tests
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------

(Updated Oct. 12, 2016, 4:31 a.m.)


Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.


Changes
-------

Revised based on the review feedback.


Bugs: HIVE-14799
    https://issues.apache.org/jira/browse/HIVE-14799


Repository: hive-git


Description
-------

This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 

Diff: https://reviews.apache.org/r/52559/diff/


Testing
-------

Manually tests
Precommit tests


Thanks,

Chaoyu Tang


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Chaoyu Tang <ct...@gmail.com>.

> On Oct. 12, 2016, 1:27 a.m., Sergey Shelukhin wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/Driver.java, line 2385
> > <https://reviews.apache.org/r/52559/diff/6/?file=1531289#file1531289line2385>
> >
> >     javadocs to distinguish close from destroy would be helpful

I added some comments about the close/destroy. It looks to me that they could be combined and I did not know the history and understand why there are two APIs. This patch just did some refactor to these methods and did not change their logic. Maybe it deserves a follow-up JIRA.


- Chaoyu


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review152246
-----------------------------------------------------------


On Oct. 11, 2016, 10:12 p.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
> 
> (Updated Oct. 11, 2016, 10:12 p.m.)
> 
> 
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
> 
> 
> Bugs: HIVE-14799
>     https://issues.apache.org/jira/browse/HIVE-14799
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 
> 
> Diff: https://reviews.apache.org/r/52559/diff/
> 
> 
> Testing
> -------
> 
> Manually tests
> Precommit tests
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/#review152246
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 2357)
<https://reviews.apache.org/r/52559/#comment221145>

    javadocs to distinguish close from destroy would be helpful


- Sergey Shelukhin


On Oct. 11, 2016, 10:12 p.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/52559/
> -----------------------------------------------------------
> 
> (Updated Oct. 11, 2016, 10:12 p.m.)
> 
> 
> Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.
> 
> 
> Bugs: HIVE-14799
>     https://issues.apache.org/jira/browse/HIVE-14799
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
> 1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
> 2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 
> 
> Diff: https://reviews.apache.org/r/52559/diff/
> 
> 
> Testing
> -------
> 
> Manually tests
> Precommit tests
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------

(Updated Oct. 11, 2016, 10:12 p.m.)


Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.


Changes
-------

revised the patch to fix the test failures


Bugs: HIVE-14799
    https://issues.apache.org/jira/browse/HIVE-14799


Repository: hive-git


Description
-------

This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 

Diff: https://reviews.apache.org/r/52559/diff/


Testing
-------

Manually tests
Precommit tests


Thanks,

Chaoyu Tang


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------

(Updated Oct. 10, 2016, 12:42 a.m.)


Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.


Bugs: HIVE-14799
    https://issues.apache.org/jira/browse/HIVE-14799


Repository: hive-git


Description
-------

This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 

Diff: https://reviews.apache.org/r/52559/diff/


Testing
-------

Manually tests
Precommit tests


Thanks,

Chaoyu Tang


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------

(Updated Oct. 10, 2016, 12:16 a.m.)


Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.


Changes
-------

optimized the locking


Bugs: HIVE-14799
    https://issues.apache.org/jira/browse/HIVE-14799


Repository: hive-git


Description
-------

This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 

Diff: https://reviews.apache.org/r/52559/diff/


Testing
-------

Manually tests
Precommit tests


Thanks,

Chaoyu Tang


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------

(Updated Oct. 9, 2016, 9:05 p.m.)


Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.


Changes
-------

upload a new patch using n owaiting model.


Bugs: HIVE-14799
    https://issues.apache.org/jira/browse/HIVE-14799


Repository: hive-git


Description
-------

This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 

Diff: https://reviews.apache.org/r/52559/diff/


Testing
-------

Manually tests
Precommit tests


Thanks,

Chaoyu Tang


Re: Review Request 52559: HIVE-14799: Query operation are not thread safe during its cancellation

Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/52559/
-----------------------------------------------------------

(Updated Oct. 6, 2016, 5:10 a.m.)


Review request for hive, Sergey Shelukhin, Thejas Nair, Vaibhav Gumashta, and Yongzhi Chen.


Changes
-------

Update the patch based on Sergio's comments


Bugs: HIVE-14799
    https://issues.apache.org/jira/browse/HIVE-14799


Repository: hive-git


Description
-------

This patch is going to fix a couple of Driver issues related to the close request from a thread other than the one running the query (e.g. from SQLOperation cancel via Timeout or Ctrl-C):
1. Driver is not thread safe and usually supports only one thread at time since it has variables like ctx, plan which are not thread protected. But certain special use cases need access the Driver objects from multiply threads. For example, when a query runs in a background thread, driver.close is invoked in another thread by the query timeout (see HIVE-4924). The close process could nullify the shared variables like ctx which could cause NPE in the other query thread which is using them. This runtime exception is unpredictable and not well handled in the code. Some resources (e.g. locks, files) are left behind and not be cleaned because there are no more available = references to them. In this patch, I use the waiting in the close which makes sure only one thread uses these variables and the resource cleaning happens after the query finished (or interrupted).
2. SQLOperation.cancel sends the interrupt signal to the background thread running the query (via backgroundHandle.cancel(true)) but it could not stop that process since there is no code to capture the signal in the process. In another word, current timeout code could not gracefully and promptly stop the query process, though it could eventually stop the process by killing the running tasks (e.g. MapRedTask) via driverContext.shutdown (see HIVE-5901). So in the patch, I added a couple of checkpoints to intercept the interrupt signal either set by close method (a volatile variable) or thread.interrupt. They should be helpful to capture these signals earlier , though not intermediately.


Diffs (updated)
-----

  common/src/java/org/apache/hadoop/hive/conf/HiveConf.java 4c3ef3e 
  ql/src/java/org/apache/hadoop/hive/ql/Driver.java dd55434 

Diff: https://reviews.apache.org/r/52559/diff/


Testing
-------

Manually tests
Precommit tests


Thanks,

Chaoyu Tang