You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Yongzhi Chen <yc...@cloudera.com> on 2017/04/14 13:14:52 UTC

Review Request 58456: Query cancel: improve the way to handle files

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58456/
-----------------------------------------------------------

Review request for hive, Aihua Xu, Chaoyu Tang, and Sergio Pena.


Bugs: HIVE-16426
    https://issues.apache.org/jira/browse/HIVE-16426


Repository: hive-git


Description
-------

1. Use threadlocal variable to store cancel state to make it is accessible without being passed around by parameters. 
2. Add checkpoints for file operations.
3. Remove backgroundHandle.cancel to avoid failed file cleanup because of the interruption. By what I observed that the method seems not very effective for scheduled operation, for example, the on going HMS API calls.


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java a80004662068eb2391c0dd7062f77156b222375b 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java b0657f01d4482dc8bb8dc180e5e7deffbdb533e6 
  ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 7a113bf8e5c4dd8c2c486741a5ebc7b8940e746b 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 04fc0a17c93120b8f6e6d7c36e4d70631d56baca 


Diff: https://reviews.apache.org/r/58456/diff/1/


Testing
-------

Manually tested.


Thanks,

Yongzhi Chen


Re: Review Request 58456: Query cancel: improve the way to handle files

Posted by Aihua Xu <ax...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58456/#review172209
-----------------------------------------------------------


Ship it!




Ship It!

- Aihua Xu


On April 14, 2017, 1:14 p.m., Yongzhi Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58456/
> -----------------------------------------------------------
> 
> (Updated April 14, 2017, 1:14 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Chaoyu Tang, and Sergio Pena.
> 
> 
> Bugs: HIVE-16426
>     https://issues.apache.org/jira/browse/HIVE-16426
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> 1. Use threadlocal variable to store cancel state to make it is accessible without being passed around by parameters. 
> 2. Add checkpoints for file operations.
> 3. Remove backgroundHandle.cancel to avoid failed file cleanup because of the interruption. By what I observed that the method seems not very effective for scheduled operation, for example, the on going HMS API calls.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java a80004662068eb2391c0dd7062f77156b222375b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java b0657f01d4482dc8bb8dc180e5e7deffbdb533e6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 7a113bf8e5c4dd8c2c486741a5ebc7b8940e746b 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 04fc0a17c93120b8f6e6d7c36e4d70631d56baca 
> 
> 
> Diff: https://reviews.apache.org/r/58456/diff/1/
> 
> 
> Testing
> -------
> 
> Manually tested.
> 
> 
> Thanks,
> 
> Yongzhi Chen
> 
>


Re: Review Request 58456: Query cancel: improve the way to handle files

Posted by Yongzhi Chen <yc...@cloudera.com>.

> On April 19, 2017, 5:50 p.m., Chaoyu Tang wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
> > Lines 46 (patched)
> > <https://reviews.apache.org/r/58456/diff/1/?file=1692688#file1692688line46>
> >
> >     To be honest, I am not very comfortable to import the Driver here. I thought the CombineHiveInputFormat in io package is at a lower architecture layer than Driver ql. 
> >     Is there any other way which we can detect if the thread has been interrupted (e.g. Thread.getCurrentThread().isInterrupted() etc?
> >     Also as I recall (if I am right), there might be a class which handles this interrupt signal globally, I could not find it at this moment.

The check in CombineHiveInputFormat is just to check the threadlocal object, it is following the same pattern to check hive's own cancel related status. I think we are trying to avoid use the 
In the CombineHiveInputFormat, it can include:
import org.apache.hadoop.hive.ql.exec.Operator;
import org.apache.hadoop.hive.ql.exec.Utilities;

I do not think it is a problem to import
org.apache.hadoop.hive.ql.Driver


> On April 19, 2017, 5:50 p.m., Chaoyu Tang wrote:
> > service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java
> > Lines 399 (patched)
> > <https://reviews.apache.org/r/58456/diff/1/?file=1692689#file1692689line399>
> >
> >     As I understand, basically the cleanup is called with the parameter state value CANCELED, TIMEOUT and CLOSED, and here you are trying to address the race issue in the normal CLOSE case where the thread should not be interrupted and further clean the tmp file. Is it right?
> >     Another thought, could moving the code
> >     {code}
> >           ss.deleteTmpOutputFile();
> >           ss.deleteTmpErrOutputFile();
> >     {code}
> >     from sqlOperation to driver close() or destroy() will be help to solve the problem?

The exception error "Failed to clean-up tmp directories." is from 
Utilities.clearWork(job); from execute, the clearWork cleans the folders used for map and reduce plan path.
ss.deleteTmpOutputFile(); ss.deleteTmpErrOutputFile(); is to clean the output data tmp folder, so they are different. 

  /**
   * Temporary file name used to store results of non-Hive commands (e.g., set, dfs)
   * and HiveServer.fetch*() function will read results from this file
   */
  protected File tmpOutputFile;

  /**
   * Temporary file name used to store error output of executing non-Hive commands (e.g., set, dfs)
   */
  protected File tmpErrOutputFile;


- Yongzhi


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58456/#review172374
-----------------------------------------------------------


On April 14, 2017, 1:14 p.m., Yongzhi Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58456/
> -----------------------------------------------------------
> 
> (Updated April 14, 2017, 1:14 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Chaoyu Tang, and Sergio Pena.
> 
> 
> Bugs: HIVE-16426
>     https://issues.apache.org/jira/browse/HIVE-16426
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> 1. Use threadlocal variable to store cancel state to make it is accessible without being passed around by parameters. 
> 2. Add checkpoints for file operations.
> 3. Remove backgroundHandle.cancel to avoid failed file cleanup because of the interruption. By what I observed that the method seems not very effective for scheduled operation, for example, the on going HMS API calls.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java a80004662068eb2391c0dd7062f77156b222375b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java b0657f01d4482dc8bb8dc180e5e7deffbdb533e6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 7a113bf8e5c4dd8c2c486741a5ebc7b8940e746b 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 04fc0a17c93120b8f6e6d7c36e4d70631d56baca 
> 
> 
> Diff: https://reviews.apache.org/r/58456/diff/1/
> 
> 
> Testing
> -------
> 
> Manually tested.
> 
> 
> Thanks,
> 
> Yongzhi Chen
> 
>


Re: Review Request 58456: Query cancel: improve the way to handle files

Posted by Chaoyu Tang <ct...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58456/#review172374
-----------------------------------------------------------




ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java
Lines 46 (patched)
<https://reviews.apache.org/r/58456/#comment245480>

    To be honest, I am not very comfortable to import the Driver here. I thought the CombineHiveInputFormat in io package is at a lower architecture layer than Driver ql. 
    Is there any other way which we can detect if the thread has been interrupted (e.g. Thread.getCurrentThread().isInterrupted() etc?
    Also as I recall (if I am right), there might be a class which handles this interrupt signal globally, I could not find it at this moment.



service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java
Lines 399 (patched)
<https://reviews.apache.org/r/58456/#comment245472>

    As I understand, basically the cleanup is called with the parameter state value CANCELED, TIMEOUT and CLOSED, and here you are trying to address the race issue in the normal CLOSE case where the thread should not be interrupted and further clean the tmp file. Is it right?
    Another thought, could moving the code
    {code}
          ss.deleteTmpOutputFile();
          ss.deleteTmpErrOutputFile();
    {code}
    from sqlOperation to driver close() or destroy() will be help to solve the problem?


- Chaoyu Tang


On April 14, 2017, 1:14 p.m., Yongzhi Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58456/
> -----------------------------------------------------------
> 
> (Updated April 14, 2017, 1:14 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Chaoyu Tang, and Sergio Pena.
> 
> 
> Bugs: HIVE-16426
>     https://issues.apache.org/jira/browse/HIVE-16426
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> 1. Use threadlocal variable to store cancel state to make it is accessible without being passed around by parameters. 
> 2. Add checkpoints for file operations.
> 3. Remove backgroundHandle.cancel to avoid failed file cleanup because of the interruption. By what I observed that the method seems not very effective for scheduled operation, for example, the on going HMS API calls.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java a80004662068eb2391c0dd7062f77156b222375b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java b0657f01d4482dc8bb8dc180e5e7deffbdb533e6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 7a113bf8e5c4dd8c2c486741a5ebc7b8940e746b 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 04fc0a17c93120b8f6e6d7c36e4d70631d56baca 
> 
> 
> Diff: https://reviews.apache.org/r/58456/diff/1/
> 
> 
> Testing
> -------
> 
> Manually tested.
> 
> 
> Thanks,
> 
> Yongzhi Chen
> 
>


Re: Review Request 58456: Query cancel: improve the way to handle files

Posted by Yongzhi Chen <yc...@cloudera.com>.

> On April 14, 2017, 5:43 p.m., Aihua Xu wrote:
> > service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java
> > Lines 399 (patched)
> > <https://reviews.apache.org/r/58456/diff/1/?file=1692689#file1692689line399>
> >
> >     I'm not exactly following what we are doing here. Not sure how background thread gets closed later.
> >     
> >     Otherwise, the other changes look good.

The background thread will complete the task or gracefully closed with the guidance of the cancel status. Our current cancel design majorly follows the pattern that cancel command set the cancel status, the working thread(background thread) check the cancel status and decide to quit or continue. The backgroundHandle.cancel(true) does not follow the pattern and cause some conflicts. The following warning log is caused by this:
2017-04-11 09:57:30,727 WARN  org.apache.hadoop.hive.ql.exec.Utilities: [HiveServer2-Background-Pool: Thread-149]: Failed to clean-up tmp directories.
java.io.InterruptedIOException: Call interrupted


- Yongzhi


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58456/#review172009
-----------------------------------------------------------


On April 14, 2017, 1:14 p.m., Yongzhi Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58456/
> -----------------------------------------------------------
> 
> (Updated April 14, 2017, 1:14 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Chaoyu Tang, and Sergio Pena.
> 
> 
> Bugs: HIVE-16426
>     https://issues.apache.org/jira/browse/HIVE-16426
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> 1. Use threadlocal variable to store cancel state to make it is accessible without being passed around by parameters. 
> 2. Add checkpoints for file operations.
> 3. Remove backgroundHandle.cancel to avoid failed file cleanup because of the interruption. By what I observed that the method seems not very effective for scheduled operation, for example, the on going HMS API calls.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java a80004662068eb2391c0dd7062f77156b222375b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java b0657f01d4482dc8bb8dc180e5e7deffbdb533e6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 7a113bf8e5c4dd8c2c486741a5ebc7b8940e746b 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 04fc0a17c93120b8f6e6d7c36e4d70631d56baca 
> 
> 
> Diff: https://reviews.apache.org/r/58456/diff/1/
> 
> 
> Testing
> -------
> 
> Manually tested.
> 
> 
> Thanks,
> 
> Yongzhi Chen
> 
>


Re: Review Request 58456: Query cancel: improve the way to handle files

Posted by Aihua Xu <ax...@cloudera.com>.

> On April 14, 2017, 5:43 p.m., Aihua Xu wrote:
> > service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java
> > Lines 399 (patched)
> > <https://reviews.apache.org/r/58456/diff/1/?file=1692689#file1692689line399>
> >
> >     I'm not exactly following what we are doing here. Not sure how background thread gets closed later.
> >     
> >     Otherwise, the other changes look good.
> 
> Yongzhi Chen wrote:
>     The background thread will complete the task or gracefully closed with the guidance of the cancel status. Our current cancel design majorly follows the pattern that cancel command set the cancel status, the working thread(background thread) check the cancel status and decide to quit or continue. The backgroundHandle.cancel(true) does not follow the pattern and cause some conflicts. The following warning log is caused by this:
>     2017-04-11 09:57:30,727 WARN  org.apache.hadoop.hive.ql.exec.Utilities: [HiveServer2-Background-Pool: Thread-149]: Failed to clean-up tmp directories.
>     java.io.InterruptedIOException: Call interrupted

I see. Got it.


- Aihua


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58456/#review172009
-----------------------------------------------------------


On April 14, 2017, 1:14 p.m., Yongzhi Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58456/
> -----------------------------------------------------------
> 
> (Updated April 14, 2017, 1:14 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Chaoyu Tang, and Sergio Pena.
> 
> 
> Bugs: HIVE-16426
>     https://issues.apache.org/jira/browse/HIVE-16426
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> 1. Use threadlocal variable to store cancel state to make it is accessible without being passed around by parameters. 
> 2. Add checkpoints for file operations.
> 3. Remove backgroundHandle.cancel to avoid failed file cleanup because of the interruption. By what I observed that the method seems not very effective for scheduled operation, for example, the on going HMS API calls.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java a80004662068eb2391c0dd7062f77156b222375b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java b0657f01d4482dc8bb8dc180e5e7deffbdb533e6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 7a113bf8e5c4dd8c2c486741a5ebc7b8940e746b 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 04fc0a17c93120b8f6e6d7c36e4d70631d56baca 
> 
> 
> Diff: https://reviews.apache.org/r/58456/diff/1/
> 
> 
> Testing
> -------
> 
> Manually tested.
> 
> 
> Thanks,
> 
> Yongzhi Chen
> 
>


Re: Review Request 58456: Query cancel: improve the way to handle files

Posted by Aihua Xu <ax...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/58456/#review172009
-----------------------------------------------------------




service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java
Lines 399 (patched)
<https://reviews.apache.org/r/58456/#comment245060>

    I'm not exactly following what we are doing here. Not sure how background thread gets closed later.
    
    Otherwise, the other changes look good.


- Aihua Xu


On April 14, 2017, 1:14 p.m., Yongzhi Chen wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/58456/
> -----------------------------------------------------------
> 
> (Updated April 14, 2017, 1:14 p.m.)
> 
> 
> Review request for hive, Aihua Xu, Chaoyu Tang, and Sergio Pena.
> 
> 
> Bugs: HIVE-16426
>     https://issues.apache.org/jira/browse/HIVE-16426
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> 1. Use threadlocal variable to store cancel state to make it is accessible without being passed around by parameters. 
> 2. Add checkpoints for file operations.
> 3. Remove backgroundHandle.cancel to avoid failed file cleanup because of the interruption. By what I observed that the method seems not very effective for scheduled operation, for example, the on going HMS API calls.
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java a80004662068eb2391c0dd7062f77156b222375b 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Utilities.java b0657f01d4482dc8bb8dc180e5e7deffbdb533e6 
>   ql/src/java/org/apache/hadoop/hive/ql/io/CombineHiveInputFormat.java 7a113bf8e5c4dd8c2c486741a5ebc7b8940e746b 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 04fc0a17c93120b8f6e6d7c36e4d70631d56baca 
> 
> 
> Diff: https://reviews.apache.org/r/58456/diff/1/
> 
> 
> Testing
> -------
> 
> Manually tested.
> 
> 
> Thanks,
> 
> Yongzhi Chen
> 
>