Posted to dev@uniffle.apache.org by GitBox <gi...@apache.org> on 2022/11/11 06:43:03 UTC

[GitHub] [incubator-uniffle] leixm opened a new issue, #309: [FEATURE] Support client request latency metrics

leixm opened a new issue, #309:
URL: https://github.com/apache/incubator-uniffle/issues/309

   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   
   
   ### Search before asking
   
   - [X] I have searched in the [issues](https://github.com/apache/incubator-uniffle/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Describe the feature
   
   When the ShuffleServer load is high, the existing metrics do not tell us directly whether client reads and writes have been significantly affected.
   
   
   ### Motivation
   
   Accurately determine whether the current server load is adding significant delay to client reads and writes.
   
   ### Describe the solution
   
   Latency monitoring is divided into two parts. The first part is the time ShuffleServer spends in its processing logic, for which we can add metrics directly. The second part is the time spent before the ShuffleServer processing logic starts, including network delay and RPC queue waiting time.
   For the second part, the client could record a timestamp just before it issues a read or write request and include that timestamp in the request. When ShuffleServer receives the request, it can compute the elapsed time and record it in its own metrics. gRPC may also support something similar.
   We can then measure the processing delay of the current ShuffleServer through percentile indicators such as p95 and p99.
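
The two-part measurement described above could be sketched roughly as follows. This is a language-neutral illustration in Python, not Uniffle code: `Request`, `LatencyRecorder`, `handle_request`, and `wall_clock_ms` are hypothetical names, and the sketch assumes client and server clocks are reasonably synchronized, since the pre-processing delay is computed across two hosts.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Request:
    # Hypothetical request that carries the client's send time (epoch millis).
    payload: bytes
    send_timestamp_ms: int


@dataclass
class LatencyRecorder:
    # Keeps raw samples so percentiles can be computed here; a real server
    # would more likely feed a histogram or summary in its metrics library.
    samples_ms: List[int] = field(default_factory=list)

    def record(self, value_ms: int) -> None:
        # Clamp negative values, which clock skew between hosts can produce.
        self.samples_ms.append(max(0, value_ms))

    def percentile(self, p: int) -> int:
        # Nearest-rank percentile for an integer p in 1..100.
        if not self.samples_ms:
            raise ValueError("no samples recorded")
        ordered = sorted(self.samples_ms)
        rank = max(1, (p * len(ordered) + 99) // 100)  # ceil(p * n / 100)
        return ordered[rank - 1]


def wall_clock_ms() -> int:
    return int(time.time() * 1000)


def handle_request(
    request: Request,
    pre_processing: LatencyRecorder,
    processing: LatencyRecorder,
    handler: Callable[[Request], None],
    now_ms: Callable[[], int] = wall_clock_ms,
) -> None:
    received_ms = now_ms()
    # Part two: network delay plus RPC queue wait, measured from the
    # timestamp the client put into the request.
    pre_processing.record(received_ms - request.send_timestamp_ms)
    handler(request)
    # Part one: time spent in the server's own processing logic.
    processing.record(now_ms() - received_ms)
```

With both recorders populated, `pre_processing.percentile(95)` and `processing.percentile(99)` give the kind of p95 and p99 indicators mentioned above. Passing `now_ms` as a parameter is only a convenience so the clock can be replaced in tests.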
   
   ### Additional context
   
   No
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@uniffle.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-uniffle] zuston commented on issue #309: [FEATURE] Support ShuffleServer latency metrics

Posted by GitBox <gi...@apache.org>.
zuston commented on issue #309:
URL: https://github.com/apache/incubator-uniffle/issues/309#issuecomment-1311610167

   Sounds great! Previously I introduced some gRPC metrics to measure the pressure, but they were not effective.




[GitHub] [incubator-uniffle] leixm commented on issue #309: [FEATURE] Support ShuffleServer latency metrics

Posted by GitBox <gi...@apache.org>.
leixm commented on issue #309:
URL: https://github.com/apache/incubator-uniffle/issues/309#issuecomment-1311292533

   @jerqi  @zuston  Do you think this feature needs to be supported?




[GitHub] [incubator-uniffle] jerqi commented on issue #309: [FEATURE] Support ShuffleServer latency metrics

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #309:
URL: https://github.com/apache/incubator-uniffle/issues/309#issuecomment-1311300381

   For a batch processing system, throughput over a given period may be more important than latency. The latency may also depend on the amount of data. It's OK with me to support some latency metrics.




[GitHub] [incubator-uniffle] leixm commented on issue #309: [FEATURE] Support ShuffleServer latency metrics

Posted by GitBox <gi...@apache.org>.
leixm commented on issue #309:
URL: https://github.com/apache/incubator-uniffle/issues/309#issuecomment-1315066776

   > For a batch processing system, throughput over a given period may be more important than latency. The latency may also depend on the amount of data. It's OK with me to support some latency metrics.
   
   You are right, but if latency stays high for a long time, it will affect the task, slowing it down or even causing it to fail. We can judge from the fluctuation in latency whether the task may have been affected.




[GitHub] [incubator-uniffle] jerqi commented on issue #309: [FEATURE] Support ShuffleServer latency metrics

Posted by GitBox <gi...@apache.org>.
jerqi commented on issue #309:
URL: https://github.com/apache/incubator-uniffle/issues/309#issuecomment-1315070665

   > > For a batch processing system, throughput over a given period may be more important than latency. The latency may also depend on the amount of data. It's OK with me to support some latency metrics.
   > 
   > You are right, but if latency stays high for a long time, it will affect the task, slowing it down or even causing it to fail. We can judge from the fluctuation in latency whether the task may have been affected.
   
   Ok, you can go ahead.




[GitHub] [incubator-uniffle] jerqi closed issue #309: [FEATURE] Support ShuffleServer latency metrics

Posted by GitBox <gi...@apache.org>.
jerqi closed issue #309: [FEATURE] Support ShuffleServer latency metrics
URL: https://github.com/apache/incubator-uniffle/issues/309

