You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Chao Sun <ch...@cloudera.com> on 2015/04/22 02:30:28 UTC

Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/
-----------------------------------------------------------

Review request for hive and Marcelo Vanzin.


Bugs: HIVE-10434
    https://issues.apache.org/jira/browse/HIVE-10434


Repository: hive-git


Description
-------

This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to
avoid potential long timeout.
It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this..


Diffs
-----

  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d 
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 

Diff: https://reviews.apache.org/r/33422/diff/


Testing
-------

Tested on my own cluster, and it worked.


Thanks,

Chao Sun


Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]

Posted by Marcelo Vanzin <va...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/#review81520
-----------------------------------------------------------

Ship it!


Ship It!

- Marcelo Vanzin


On April 23, 2015, 6:54 p.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33422/
> -----------------------------------------------------------
> 
> (Updated April 23, 2015, 6:54 p.m.)
> 
> 
> Review request for hive and Marcelo Vanzin.
> 
> 
> Bugs: HIVE-10434
>     https://issues.apache.org/jira/browse/HIVE-10434
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to
> avoid potential long timeout.
> It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this..
> 
> 
> Diffs
> -----
> 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d 
>   spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 
> 
> Diff: https://reviews.apache.org/r/33422/diff/
> 
> 
> Testing
> -------
> 
> Tested on my own cluster, and it worked.
> 
> 
> Thanks,
> 
> Chao Sun
> 
>


Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]

Posted by Chao Sun <ch...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/
-----------------------------------------------------------

(Updated April 23, 2015, 6:54 p.m.)


Review request for hive and Marcelo Vanzin.


Changes
-------

Addressing comments from Xuefu and Marcelo.


Bugs: HIVE-10434
    https://issues.apache.org/jira/browse/HIVE-10434


Repository: hive-git


Description
-------

This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to
avoid potential long timeout.
It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this..


Diffs (updated)
-----

  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d 
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 

Diff: https://reviews.apache.org/r/33422/diff/


Testing
-------

Tested on my own cluster, and it worked.


Thanks,

Chao Sun


Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]

Posted by Chao Sun <ch...@cloudera.com>.

> On April 23, 2015, 6:22 p.m., Xuefu Zhang wrote:
> > spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java, line 176
> > <https://reviews.apache.org/r/33422/diff/2/?file=939013#file939013line176>
> >
> >     I'm wondering if cinfo can be null here. After the contains() check above, things might have changed. So, cinfo is not guaranteed to be not null.
> 
> Marcelo Vanzin wrote:
>     Yeah, that was my suggestion above. Don't use `containsKey`, instead just remove and check for null.

Yes, that's correct. I ignored that case as well..


- Chao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/#review81361
-----------------------------------------------------------


On April 23, 2015, 6:54 p.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33422/
> -----------------------------------------------------------
> 
> (Updated April 23, 2015, 6:54 p.m.)
> 
> 
> Review request for hive and Marcelo Vanzin.
> 
> 
> Bugs: HIVE-10434
>     https://issues.apache.org/jira/browse/HIVE-10434
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to
> avoid potential long timeout.
> It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this..
> 
> 
> Diffs
> -----
> 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d 
>   spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 
> 
> Diff: https://reviews.apache.org/r/33422/diff/
> 
> 
> Testing
> -------
> 
> Tested on my own cluster, and it worked.
> 
> 
> Thanks,
> 
> Chao Sun
> 
>


Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]

Posted by Marcelo Vanzin <va...@cloudera.com>.

> On April 23, 2015, 6:22 p.m., Xuefu Zhang wrote:
> > spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java, line 176
> > <https://reviews.apache.org/r/33422/diff/2/?file=939013#file939013line176>
> >
> >     I'm wondering if cinfo can be null here. After the contains() check above, things might have changed. So, cinfo is not guaranteed to be not null.

Yeah, that was my suggestion above. Don't use `containsKey`, instead just remove and check for null.


- Marcelo


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/#review81361
-----------------------------------------------------------


On April 23, 2015, 6:11 p.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33422/
> -----------------------------------------------------------
> 
> (Updated April 23, 2015, 6:11 p.m.)
> 
> 
> Review request for hive and Marcelo Vanzin.
> 
> 
> Bugs: HIVE-10434
>     https://issues.apache.org/jira/browse/HIVE-10434
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to
> avoid potential long timeout.
> It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this..
> 
> 
> Diffs
> -----
> 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d 
>   spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 
> 
> Diff: https://reviews.apache.org/r/33422/diff/
> 
> 
> Testing
> -------
> 
> Tested on my own cluster, and it worked.
> 
> 
> Thanks,
> 
> Chao Sun
> 
>


Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]

Posted by Xuefu Zhang <xz...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/#review81361
-----------------------------------------------------------



spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java
<https://reviews.apache.org/r/33422/#comment131705>

    I'm wondering if cinfo can be null here. After the contains() check above, things might have changed. So, cinfo is not guaranteed to be not null.


- Xuefu Zhang


On April 23, 2015, 6:11 p.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33422/
> -----------------------------------------------------------
> 
> (Updated April 23, 2015, 6:11 p.m.)
> 
> 
> Review request for hive and Marcelo Vanzin.
> 
> 
> Bugs: HIVE-10434
>     https://issues.apache.org/jira/browse/HIVE-10434
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to
> avoid potential long timeout.
> It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this..
> 
> 
> Diffs
> -----
> 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d 
>   spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 
> 
> Diff: https://reviews.apache.org/r/33422/diff/
> 
> 
> Testing
> -------
> 
> Tested on my own cluster, and it worked.
> 
> 
> Thanks,
> 
> Chao Sun
> 
>


Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]

Posted by Chao Sun <ch...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/
-----------------------------------------------------------

(Updated April 23, 2015, 6:11 p.m.)


Review request for hive and Marcelo Vanzin.


Changes
-------

Addressing Marcelo's comment.


Bugs: HIVE-10434
    https://issues.apache.org/jira/browse/HIVE-10434


Repository: hive-git


Description
-------

This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to
avoid potential long timeout.
It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this..


Diffs (updated)
-----

  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d 
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 

Diff: https://reviews.apache.org/r/33422/diff/


Testing
-------

Tested on my own cluster, and it worked.


Thanks,

Chao Sun


Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]

Posted by Chao Sun <ch...@cloudera.com>.

> On April 23, 2015, 4:55 p.m., Marcelo Vanzin wrote:
> > spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java, line 172
> > <https://reviews.apache.org/r/33422/diff/2/?file=939013#file939013line172>
> >
> >     To avoid races, I'd do:
> >     
> >         final ClientInfo cinfo = pendingClients.remove(clientId);
> >         if (cinfo == null) { /* nothing to do */ }

Yeah, this could happen if the method is called by multiple threads with same clientId, even though this shouldn't happen in the current case, I think. Should be fixed. Thanks for the suggestion.


- Chao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/#review81328
-----------------------------------------------------------


On April 22, 2015, 1:25 a.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33422/
> -----------------------------------------------------------
> 
> (Updated April 22, 2015, 1:25 a.m.)
> 
> 
> Review request for hive and Marcelo Vanzin.
> 
> 
> Bugs: HIVE-10434
>     https://issues.apache.org/jira/browse/HIVE-10434
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to
> avoid potential long timeout.
> It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this..
> 
> 
> Diffs
> -----
> 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d 
>   spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 
> 
> Diff: https://reviews.apache.org/r/33422/diff/
> 
> 
> Testing
> -------
> 
> Tested on my own cluster, and it worked.
> 
> 
> Thanks,
> 
> Chao Sun
> 
>


Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]

Posted by Marcelo Vanzin <va...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/#review81328
-----------------------------------------------------------

Ship it!


Just a minor thing left to fix.


spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java
<https://reviews.apache.org/r/33422/#comment131664>

    To avoid races, I'd do:
    
        final ClientInfo cinfo = pendingClients.remove(clientId);
        if (cinfo == null) { /* nothing to do */ }


- Marcelo Vanzin


On April 22, 2015, 1:25 a.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33422/
> -----------------------------------------------------------
> 
> (Updated April 22, 2015, 1:25 a.m.)
> 
> 
> Review request for hive and Marcelo Vanzin.
> 
> 
> Bugs: HIVE-10434
>     https://issues.apache.org/jira/browse/HIVE-10434
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to
> avoid potential long timeout.
> It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this..
> 
> 
> Diffs
> -----
> 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d 
>   spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 
> 
> Diff: https://reviews.apache.org/r/33422/diff/
> 
> 
> Testing
> -------
> 
> Tested on my own cluster, and it worked.
> 
> 
> Thanks,
> 
> Chao Sun
> 
>


Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]

Posted by Chao Sun <ch...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/
-----------------------------------------------------------

(Updated April 22, 2015, 1:25 a.m.)


Review request for hive and Marcelo Vanzin.


Bugs: HIVE-10434
    https://issues.apache.org/jira/browse/HIVE-10434


Repository: hive-git


Description
-------

This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to
avoid potential long timeout.
It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this..


Diffs (updated)
-----

  spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d 
  spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 

Diff: https://reviews.apache.org/r/33422/diff/


Testing
-------

Tested on my own cluster, and it worked.


Thanks,

Chao Sun


Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]

Posted by Chao Sun <ch...@cloudera.com>.

> On April 22, 2015, 12:38 a.m., Marcelo Vanzin wrote:
> > spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java, line 172
> > <https://reviews.apache.org/r/33422/diff/1/?file=938965#file938965line172>
> >
> >     This will throw an exception if the child process exits with a non-zero status after the RSC connects back to HS2. I don't think you want that.

Oh yes. I forgot that case.


- Chao


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/#review81103
-----------------------------------------------------------


On April 22, 2015, 12:30 a.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33422/
> -----------------------------------------------------------
> 
> (Updated April 22, 2015, 12:30 a.m.)
> 
> 
> Review request for hive and Marcelo Vanzin.
> 
> 
> Bugs: HIVE-10434
>     https://issues.apache.org/jira/browse/HIVE-10434
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to
> avoid potential long timeout.
> It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this..
> 
> 
> Diffs
> -----
> 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d 
>   spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 
> 
> Diff: https://reviews.apache.org/r/33422/diff/
> 
> 
> Testing
> -------
> 
> Tested on my own cluster, and it worked.
> 
> 
> Thanks,
> 
> Chao Sun
> 
>


Re: Review Request 33422: HIVE-10434 - Cancel connection when remote Spark driver process has failed [Spark Branch]

Posted by Marcelo Vanzin <va...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/33422/#review81103
-----------------------------------------------------------



spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java
<https://reviews.apache.org/r/33422/#comment131349>

    This will throw an exception if the child process exits with a non-zero status after the RSC connects back to HS2. I don't think you want that.



spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java
<https://reviews.apache.org/r/33422/#comment131351>

    While the only current call site reflects the error message, this method seems more generic than that. Maybe pass the error message as a parameter to the method?


- Marcelo Vanzin


On April 22, 2015, 12:30 a.m., Chao Sun wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/33422/
> -----------------------------------------------------------
> 
> (Updated April 22, 2015, 12:30 a.m.)
> 
> 
> Review request for hive and Marcelo Vanzin.
> 
> 
> Bugs: HIVE-10434
>     https://issues.apache.org/jira/browse/HIVE-10434
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> This patch cancels the connection from HS2 to remote process once the latter has failed and exited with error code, to
> avoid potential long timeout.
> It add a new public method cancelClient to the RpcServer class - not sure whether there's an easier way to do this..
> 
> 
> Diffs
> -----
> 
>   spark-client/src/main/java/org/apache/hive/spark/client/SparkClientImpl.java 71e432d 
>   spark-client/src/main/java/org/apache/hive/spark/client/rpc/RpcServer.java 32d4c46 
> 
> Diff: https://reviews.apache.org/r/33422/diff/
> 
> 
> Testing
> -------
> 
> Tested on my own cluster, and it worked.
> 
> 
> Thanks,
> 
> Chao Sun
> 
>