You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Jason Dere <jd...@hortonworks.com> on 2017/10/26 02:42:26 UTC

Review Request 63319: HIVE-17908: LLAP External client not correctly handling killTask for pending requests

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63319/
-----------------------------------------------------------

Review request for hive and Siddharth Seth.


Bugs: HIVE-17908
    https://issues.apache.org/jira/browse/HIVE-17908


Repository: hive-git


Description
-------

- Pending requests should retry if killTask received
- Change retry delay to use exponential backoff


Diffs
-----

  llap-client/src/java/org/apache/hadoop/hive/llap/ext/LlapTaskUmbilicalExternalClient.java aa94e54 


Diff: https://reviews.apache.org/r/63319/diff/1/


Testing
-------


Thanks,

Jason Dere


Re: Review Request 63319: HIVE-17908: LLAP External client not correctly handling killTask for pending requests

Posted by Jason Dere <jd...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63319/#review190057
-----------------------------------------------------------




llap-client/src/java/org/apache/hadoop/hive/llap/ext/LlapTaskUmbilicalExternalClient.java
Lines 82 (patched)
<https://reviews.apache.org/r/63319/#comment267311>

    Whoops, will change it to non-static.



llap-client/src/java/org/apache/hadoop/hive/llap/ext/LlapTaskUmbilicalExternalClient.java
Lines 550 (patched)
<https://reviews.apache.org/r/63319/#comment267315>

    The retry could be initiated from 2 different places:
    1) From receiving a REJECTED response during the fragment submission
    2) From receiving a taskKilled notification, prior to the first task heartbeat.
    
    So both events would occur in the case that the fragment submission was rejected, and I wanted only one of the retries issue here to take effect. The semaphore was supposed to try to enforce that by having only one sender active at any time.
    
    Thinking about this again, I think another way to avoid this conflict would be to only call the retry for case (2), which should work as long as it is the case that a rejected fragment always receives a taskKilled notification. This wouldn't require the semapore.


- Jason Dere


On Oct. 26, 2017, 2:42 a.m., Jason Dere wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63319/
> -----------------------------------------------------------
> 
> (Updated Oct. 26, 2017, 2:42 a.m.)
> 
> 
> Review request for hive and Siddharth Seth.
> 
> 
> Bugs: HIVE-17908
>     https://issues.apache.org/jira/browse/HIVE-17908
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> - Pending requests should retry if killTask received
> - Change retry delay to use exponential backoff
> 
> 
> Diffs
> -----
> 
>   llap-client/src/java/org/apache/hadoop/hive/llap/ext/LlapTaskUmbilicalExternalClient.java aa94e54 
> 
> 
> Diff: https://reviews.apache.org/r/63319/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jason Dere
> 
>


Re: Review Request 63319: HIVE-17908: LLAP External client not correctly handling killTask for pending requests

Posted by Jason Dere <jd...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63319/
-----------------------------------------------------------

(Updated Nov. 3, 2017, 9:18 p.m.)


Review request for hive and Siddharth Seth.


Changes
-------

Updating patch per feedback from Sergey


Bugs: HIVE-17908
    https://issues.apache.org/jira/browse/HIVE-17908


Repository: hive-git


Description
-------

- Pending requests should retry if killTask received
- Change retry delay to use exponential backoff


Diffs (updated)
-----

  llap-client/src/java/org/apache/hadoop/hive/llap/ext/LlapTaskUmbilicalExternalClient.java aa94e54 


Diff: https://reviews.apache.org/r/63319/diff/2/

Changes: https://reviews.apache.org/r/63319/diff/1-2/


Testing
-------


Thanks,

Jason Dere


Re: Review Request 63319: HIVE-17908: LLAP External client not correctly handling killTask for pending requests

Posted by Sergey Shelukhin <se...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/63319/#review190055
-----------------------------------------------------------




llap-client/src/java/org/apache/hadoop/hive/llap/ext/LlapTaskUmbilicalExternalClient.java
Lines 82 (patched)
<https://reviews.apache.org/r/63319/#comment267306>

    random is not thread safe



llap-client/src/java/org/apache/hadoop/hive/llap/ext/LlapTaskUmbilicalExternalClient.java
Lines 370 (patched)
<https://reviews.apache.org/r/63319/#comment267307>

    nit: save in ctor/init?



llap-client/src/java/org/apache/hadoop/hive/llap/ext/LlapTaskUmbilicalExternalClient.java
Lines 550 (patched)
<https://reviews.apache.org/r/63319/#comment267308>

    I don't quite understand the logic around semaphore. For now it's try-acquired on send and released on response, so it seems like for retries after the response (in the callback) it's unneeded, and out of bounds retries like this won't work because it's still acquired while the response is still pending, so tryacquire would return false. Perhaps a comment would be helpful on the field about the lifecycle for acquire/release


- Sergey Shelukhin


On Oct. 26, 2017, 2:42 a.m., Jason Dere wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/63319/
> -----------------------------------------------------------
> 
> (Updated Oct. 26, 2017, 2:42 a.m.)
> 
> 
> Review request for hive and Siddharth Seth.
> 
> 
> Bugs: HIVE-17908
>     https://issues.apache.org/jira/browse/HIVE-17908
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> - Pending requests should retry if killTask received
> - Change retry delay to use exponential backoff
> 
> 
> Diffs
> -----
> 
>   llap-client/src/java/org/apache/hadoop/hive/llap/ext/LlapTaskUmbilicalExternalClient.java aa94e54 
> 
> 
> Diff: https://reviews.apache.org/r/63319/diff/1/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Jason Dere
> 
>