You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by Akshay Goyal <ak...@gmail.com> on 2016/03/02 12:09:10 UTC

Re: Review Request 42134: More information to user on GetOperationStatus in Hive Server2 when query is still executing

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42134/
-----------------------------------------------------------

(Updated March 2, 2016, 11:09 a.m.)


Review request for hive and Amareshwari Sriramadasu.


Changes
-------

review comments addressed.


Bugs: HIVE-4570
    https://issues.apache.org/jira/browse/HIVE-4570


Repository: hive-git


Description
-------

Driver maintains list of running and runnable tasks although that info is not exposed outside. It's kept locally in the driver's execute method. We can add Driver.getTaskStatuses() to return status on all tasks (both running and runnable). Similarly, start and completion times for operations.

Proposed changes are :

struct TGetOperationStatusResp {
  1: required TStatus status
  2: optional TOperationState operationState

  // If operationState is ERROR_STATE, then the following fields may be set
  // sqlState as defined in the ISO/IEF CLI specification
  3: optional string sqlState

  // Internal error code
  4: optional i32 errorCode

  // Error message
  5: optional string errorMessage

  // List of statuses of sub tasks
  6: optional string taskStatus

  // When was the operation started
  7: optional i64 operationStarted
  // When was the operation completed
  8: optional i64 operationCompleted

}


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 3253146 
  ql/src/java/org/apache/hadoop/hive/ql/TaskStatus.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java e199e5e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java b184b4e 
  service-rpc/if/TCLIService.thrift 0aa9d13 
  service/src/java/org/apache/hive/service/cli/OperationStatus.java e45b828 
  service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java 8868ec1 
  service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java 35b6c52 
  service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java 8db2e62 
  service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java d6f6280 
  service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java a09b39a 
  service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 740b851 
  service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 2a0fec2 
  service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java f5a9771 
  service/src/java/org/apache/hive/service/cli/operation/Operation.java 22f725c 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 100dc6a 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 8dff264 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java 5f01165 
  service/src/test/org/apache/hive/service/cli/CLIServiceTest.java e78181a 

Diff: https://reviews.apache.org/r/42134/diff/


Testing
-------


Thanks,

Akshay Goyal


Re: Review Request 42134: More information to user on GetOperationStatus in Hive Server2 when query is still executing

Posted by Rajat Khandelwal <ra...@gmail.com>.

> On March 3, 2016, 6:20 a.m., Szehon Ho wrote:
> > This requirement is very similar to webui's display of operations, can you look at QueryDisplay and see if it can be enhanced and combined (rather than repeating all this container classes).  That class also gives you concurrency protection for free as well as marking start/end time for free ..

Tasks are maintained by the driver, and they are only sent to query display when it is started. So Query display can not track other states, e.g. QUEUED, INITIALIZED etc. I don't think query display is the way to solve this. Please suggest some alternative.

I'll be taking up this issue, and hence I've created a new review request. Please review at https://reviews.apache.org/r/44453/diff/1#index_header


> On March 3, 2016, 6:20 a.m., Szehon Ho wrote:
> > ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java, line 412
> > <https://reviews.apache.org/r/42134/diff/5/?file=1276514#file1276514line412>
> >
> >     Is it possible to change Task class so that we just keep around state member variable, instead of individual booleans for done, queued, etc?
> >     
> >     Then it would be the other way, isDone would just return the state == DONE, etc.

Fixed.


> On March 3, 2016, 6:20 a.m., Szehon Ho wrote:
> > service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java, line 135
> > <https://reviews.apache.org/r/42134/diff/5/?file=1276527#file1276527line135>
> >
> >     Can we put this under onNewState, check the state if its a RUNNING state and set the time there, instead of calling everywhere.
> >     
> >     Same for markOperationCompletedTime(), check if state is terminal state and set the time there.
> >     
> >     Same comment across all operations..

Fixed.


- Rajat


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42134/#review121751
-----------------------------------------------------------


On March 2, 2016, 5:14 p.m., Akshay Goyal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42134/
> -----------------------------------------------------------
> 
> (Updated March 2, 2016, 5:14 p.m.)
> 
> 
> Review request for hive and Amareshwari Sriramadasu.
> 
> 
> Bugs: HIVE-4570
>     https://issues.apache.org/jira/browse/HIVE-4570
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Driver maintains list of running and runnable tasks although that info is not exposed outside. It's kept locally in the driver's execute method. We can add Driver.getTaskStatuses() to return status on all tasks (both running and runnable). Similarly, start and completion times for operations.
> 
> Proposed changes are :
> 
> struct TGetOperationStatusResp {
>   1: required TStatus status
>   2: optional TOperationState operationState
> 
>   // If operationState is ERROR_STATE, then the following fields may be set
>   // sqlState as defined in the ISO/IEF CLI specification
>   3: optional string sqlState
> 
>   // Internal error code
>   4: optional i32 errorCode
> 
>   // Error message
>   5: optional string errorMessage
> 
>   // List of statuses of sub tasks
>   6: optional string taskStatus
> 
>   // When was the operation started
>   7: optional i64 operationStarted
>   // When was the operation completed
>   8: optional i64 operationCompleted
> 
> }
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 3253146 
>   ql/src/java/org/apache/hadoop/hive/ql/TaskStatus.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java e199e5e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java b184b4e 
>   service-rpc/if/TCLIService.thrift 0aa9d13 
>   service/src/java/org/apache/hive/service/cli/OperationStatus.java e45b828 
>   service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java 8868ec1 
>   service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java 35b6c52 
>   service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java 8db2e62 
>   service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java d6f6280 
>   service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java a09b39a 
>   service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 740b851 
>   service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 2a0fec2 
>   service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java f5a9771 
>   service/src/java/org/apache/hive/service/cli/operation/Operation.java 22f725c 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 100dc6a 
>   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 8dff264 
>   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java 5f01165 
>   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java e78181a 
> 
> Diff: https://reviews.apache.org/r/42134/diff/
> 
> 
> Testing
> -------
> 
> Test cases added to validate task status and operation start and end times.
> 
> Task status json seen with one stage is: [{"taskId":"Stage-1","externalHandle":"job_local1121055131_0136","taskState":"FINISHED_STATE"}]
> 
> And TaskStatus toString() returns Stage-1/job_local1121055131_0136/FINISHED_STATE for the same.
> 
> 
> Thanks,
> 
> Akshay Goyal
> 
>


Re: Review Request 42134: More information to user on GetOperationStatus in Hive Server2 when query is still executing

Posted by Szehon Ho <sz...@cloudera.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42134/#review121751
-----------------------------------------------------------



This requirement is very similar to webui's display of operations, can you look at QueryDisplay and see if it can be enhanced and combined (rather than repeating all this container classes).  That class also gives you concurrency protection for free as well as marking start/end time for free ..


ql/src/java/org/apache/hadoop/hive/ql/Driver.java (line 2009)
<https://reviews.apache.org/r/42134/#comment183563>

    This looks risky in terms of concurrency.  I dont see much concurrency protection.. Driver is used by one thread and getStatus is called by another , I believe.



ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java (line 412)
<https://reviews.apache.org/r/42134/#comment183561>

    Is it possible to change Task class so that we just keep around state member variable, instead of individual booleans for done, queued, etc?
    
    Then it would be the other way, isDone would just return the state == DONE, etc.



service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java (line 132)
<https://reviews.apache.org/r/42134/#comment183559>

    Can we put this under onNewState, check the state if its a RUNNING state and set the time there, instead of calling everywhere.
    
    Same for markOperationCompletedTime(), check if state is terminal state and set the time there.
    
    Same comment across all operations..


- Szehon Ho


On March 2, 2016, 11:44 a.m., Akshay Goyal wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/42134/
> -----------------------------------------------------------
> 
> (Updated March 2, 2016, 11:44 a.m.)
> 
> 
> Review request for hive and Amareshwari Sriramadasu.
> 
> 
> Bugs: HIVE-4570
>     https://issues.apache.org/jira/browse/HIVE-4570
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Driver maintains list of running and runnable tasks although that info is not exposed outside. It's kept locally in the driver's execute method. We can add Driver.getTaskStatuses() to return status on all tasks (both running and runnable). Similarly, start and completion times for operations.
> 
> Proposed changes are :
> 
> struct TGetOperationStatusResp {
>   1: required TStatus status
>   2: optional TOperationState operationState
> 
>   // If operationState is ERROR_STATE, then the following fields may be set
>   // sqlState as defined in the ISO/IEF CLI specification
>   3: optional string sqlState
> 
>   // Internal error code
>   4: optional i32 errorCode
> 
>   // Error message
>   5: optional string errorMessage
> 
>   // List of statuses of sub tasks
>   6: optional string taskStatus
> 
>   // When was the operation started
>   7: optional i64 operationStarted
>   // When was the operation completed
>   8: optional i64 operationCompleted
> 
> }
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/Driver.java 3253146 
>   ql/src/java/org/apache/hadoop/hive/ql/TaskStatus.java PRE-CREATION 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java e199e5e 
>   ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java b184b4e 
>   service-rpc/if/TCLIService.thrift 0aa9d13 
>   service/src/java/org/apache/hive/service/cli/OperationStatus.java e45b828 
>   service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java 8868ec1 
>   service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java 35b6c52 
>   service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java 8db2e62 
>   service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java d6f6280 
>   service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java a09b39a 
>   service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 740b851 
>   service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 2a0fec2 
>   service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java f5a9771 
>   service/src/java/org/apache/hive/service/cli/operation/Operation.java 22f725c 
>   service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 100dc6a 
>   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 8dff264 
>   service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java 5f01165 
>   service/src/test/org/apache/hive/service/cli/CLIServiceTest.java e78181a 
> 
> Diff: https://reviews.apache.org/r/42134/diff/
> 
> 
> Testing
> -------
> 
> Test cases added to validate task status and operation start and end times.
> 
> Task status json seen with one stage is: [{"taskId":"Stage-1","externalHandle":"job_local1121055131_0136","taskState":"FINISHED_STATE"}]
> 
> And TaskStatus toString() returns Stage-1/job_local1121055131_0136/FINISHED_STATE for the same.
> 
> 
> Thanks,
> 
> Akshay Goyal
> 
>


Re: Review Request 42134: More information to user on GetOperationStatus in Hive Server2 when query is still executing

Posted by Akshay Goyal <ak...@gmail.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/42134/
-----------------------------------------------------------

(Updated March 2, 2016, 11:44 a.m.)


Review request for hive and Amareshwari Sriramadasu.


Bugs: HIVE-4570
    https://issues.apache.org/jira/browse/HIVE-4570


Repository: hive-git


Description
-------

Driver maintains list of running and runnable tasks although that info is not exposed outside. It's kept locally in the driver's execute method. We can add Driver.getTaskStatuses() to return status on all tasks (both running and runnable). Similarly, start and completion times for operations.

Proposed changes are :

struct TGetOperationStatusResp {
  1: required TStatus status
  2: optional TOperationState operationState

  // If operationState is ERROR_STATE, then the following fields may be set
  // sqlState as defined in the ISO/IEF CLI specification
  3: optional string sqlState

  // Internal error code
  4: optional i32 errorCode

  // Error message
  5: optional string errorMessage

  // List of statuses of sub tasks
  6: optional string taskStatus

  // When was the operation started
  7: optional i64 operationStarted
  // When was the operation completed
  8: optional i64 operationCompleted

}


Diffs
-----

  ql/src/java/org/apache/hadoop/hive/ql/Driver.java 3253146 
  ql/src/java/org/apache/hadoop/hive/ql/TaskStatus.java PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/Task.java e199e5e 
  ql/src/java/org/apache/hadoop/hive/ql/exec/mr/ExecDriver.java b184b4e 
  service-rpc/if/TCLIService.thrift 0aa9d13 
  service/src/java/org/apache/hive/service/cli/OperationStatus.java e45b828 
  service/src/java/org/apache/hive/service/cli/operation/GetCatalogsOperation.java 8868ec1 
  service/src/java/org/apache/hive/service/cli/operation/GetColumnsOperation.java 35b6c52 
  service/src/java/org/apache/hive/service/cli/operation/GetFunctionsOperation.java 8db2e62 
  service/src/java/org/apache/hive/service/cli/operation/GetSchemasOperation.java d6f6280 
  service/src/java/org/apache/hive/service/cli/operation/GetTableTypesOperation.java a09b39a 
  service/src/java/org/apache/hive/service/cli/operation/GetTablesOperation.java 740b851 
  service/src/java/org/apache/hive/service/cli/operation/GetTypeInfoOperation.java 2a0fec2 
  service/src/java/org/apache/hive/service/cli/operation/HiveCommandOperation.java f5a9771 
  service/src/java/org/apache/hive/service/cli/operation/Operation.java 22f725c 
  service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java 100dc6a 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIService.java 8dff264 
  service/src/java/org/apache/hive/service/cli/thrift/ThriftCLIServiceClient.java 5f01165 
  service/src/test/org/apache/hive/service/cli/CLIServiceTest.java e78181a 

Diff: https://reviews.apache.org/r/42134/diff/


Testing (updated)
-------

Test cases added to validate task status and operation start and end times.

Task status json seen with one stage is: [{"taskId":"Stage-1","externalHandle":"job_local1121055131_0136","taskState":"FINISHED_STATE"}]

And TaskStatus toString() returns Stage-1/job_local1121055131_0136/FINISHED_STATE for the same.


Thanks,

Akshay Goyal