Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2021/08/19 10:01:24 UTC

[GitHub] [incubator-doris] ccoffline opened a new issue #6477: [Performance] show proc statistic takes too long

ccoffline opened a new issue #6477:
URL: https://github.com/apache/incubator-doris/issues/6477


   ```sql
   show proc 'statistic';
   ```
   
   This statement recalculates statistics over every replica, which takes a very long time on a large cluster.
   The statistics do not need to be recalculated on every call; one update per minute is enough.
   The calculation could also run in parallel.
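
The parallel calculation suggested above could look roughly like the sketch below. It assumes each database can be summarized independently of the others. `DbStat`, `ParallelStatistic` and the input map are hypothetical names for illustration, not Doris classes, and the sketch ignores the locking a real catalog would need.

```java
import java.util.List;
import java.util.Map;

// Hypothetical per-database summary; not a Doris class.
final class DbStat {
    final long tablets;
    final long replicas;

    DbStat(long tablets, long replicas) {
        this.tablets = tablets;
        this.replicas = replicas;
    }

    // Combining two summaries is associative, which a parallel reduce requires.
    DbStat plus(DbStat other) {
        return new DbStat(tablets + other.tablets, replicas + other.replicas);
    }
}

final class ParallelStatistic {
    // Input (assumed shape): database name -> replica count of each tablet.
    // Each database is summarized on its own, so the work can be spread
    // over the common fork-join pool with a parallel stream.
    static DbStat total(Map<String, List<Integer>> replicasPerTablet) {
        return replicasPerTablet.values()
                .parallelStream()
                .map(ParallelStatistic::summarize)
                .reduce(new DbStat(0, 0), DbStat::plus);
    }

    // Sequential summary of a single database.
    private static DbStat summarize(List<Integer> tablets) {
        long replicas = 0;
        for (int n : tablets) {
            replicas += n;
        }
        return new DbStat(tablets.size(), replicas);
    }
}
```

Splitting by database keeps each task coarse enough that scheduling overhead stays small; whether that is the right granularity for a cluster with tens of millions of replicas would need measuring.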


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] caiconghui commented on issue #6477: [Performance] show proc statistic takes too long

Posted by GitBox <gi...@apache.org>.
caiconghui commented on issue #6477:
URL: https://github.com/apache/incubator-doris/issues/6477#issuecomment-902410160


   Which Doris version do you use? How many replicas are there in the cluster in total? How long does a single `show proc` take?




[GitHub] [incubator-doris] caiconghui closed issue #6477: [Performance] show proc statistic takes too long

Posted by GitBox <gi...@apache.org>.
caiconghui closed issue #6477:
URL: https://github.com/apache/incubator-doris/issues/6477


   




[GitHub] [incubator-doris] ccoffline commented on issue #6477: [Performance] show proc statistic takes too long

Posted by GitBox <gi...@apache.org>.
ccoffline commented on issue #6477:
URL: https://github.com/apache/incubator-doris/issues/6477#issuecomment-903421823


   > Which Doris version do you use? How many replicas are there in the cluster in total? How long does a single `show proc` take?
   
   * Our build is rebased on 0.13.11 and has the same implementation in `StatisticProcDir.java`.
   * We have many large-scale clusters, and on many of them querying `/statistic` takes at least 5-10s. The biggest cluster has 45,000,000 replicas.
   * This proc currently calculates over all replicas twice per query because of the interface design: once when fetching the meta, and then again. Besides, the proc status does not need to refresh on every call; it could be cached for a minute.




[GitHub] [incubator-doris] ccoffline commented on issue #6477: [Performance] show proc statistic takes too long

Posted by GitBox <gi...@apache.org>.
ccoffline commented on issue #6477:
URL: https://github.com/apache/incubator-doris/issues/6477#issuecomment-903432100


   I'll push a PR later on; this is just a notice.




[GitHub] [incubator-doris] ccoffline commented on issue #6477: [Performance] show proc statistic takes too long

Posted by GitBox <gi...@apache.org>.
ccoffline commented on issue #6477:
URL: https://github.com/apache/incubator-doris/issues/6477#issuecomment-904411893


   > I think speeding up the calculation is needed, but in most cases we need the real-time cluster replica statistics; a cache may be confusing when we need to track down cluster problems.
   
   The cache is configurable; setting the cache timeout to 0 could disable it. The cache is mainly there to prevent high-frequency queries within a short time, which can cause high calculation cost and heavy lock contention. A timeout of 1s may be close enough to real-time refreshing.




[GitHub] [incubator-doris] ccoffline commented on issue #6477: [Performance] show proc statistic takes too long

Posted by GitBox <gi...@apache.org>.
ccoffline commented on issue #6477:
URL: https://github.com/apache/incubator-doris/issues/6477#issuecomment-905489753


   > Maybe we can provide two choices, sync or async: high-frequency queries can use the async mode, while for other normal users we just speed up the calculation and use the sync mode. What do you think?
   
   I'll disable the cache by default, or just remove it if you insist.
   




[GitHub] [incubator-doris] ccoffline edited a comment on issue #6477: [Performance] show proc statistic takes too long

Posted by GitBox <gi...@apache.org>.
ccoffline edited a comment on issue #6477:
URL: https://github.com/apache/incubator-doris/issues/6477#issuecomment-904411893


   > I think speeding up the calculation is needed, but in most cases we need the real-time cluster replica statistics; a cache may be confusing when we need to track down cluster problems.
   
   The cache is configurable; setting the cache timeout to 0 could disable it. The cache is mainly there to prevent high-frequency queries within a short time, which can cause high calculation cost and heavy lock contention. A timeout of 1s may be close enough to real-time refreshing.
   
   The cache is easy to implement. The previous version already stores the state, which may cause concurrency issues, so the cache is more a fix for a concurrency bug than a new feature.
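
A configurable cache of the kind described here might be sketched as follows. This is only an illustration: `TimedCache` is a hypothetical helper, not code from Doris or from the eventual PR, and the injectable clock is an assumption made so the behavior can be checked without sleeping. A timeout of 0 disables caching, as proposed.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical helper, not part of Doris.
final class TimedCache<T> {
    private final Supplier<T> loader;   // the expensive calculation
    private final long timeoutMs;       // 0 (or less) disables caching
    private final LongSupplier clock;   // current time in ms, injectable for tests
    private T value;
    private long loadedAtMs;
    private boolean loaded;

    TimedCache(Supplier<T> loader, long timeoutMs, LongSupplier clock) {
        this.loader = loader;
        this.timeoutMs = timeoutMs;
        this.clock = clock;
    }

    // synchronized: concurrent callers wait for one calculation instead of
    // each starting their own, and the cached state is never read half-updated.
    synchronized T get() {
        long now = clock.getAsLong();
        boolean expired = !loaded || now - loadedAtMs >= timeoutMs;
        if (timeoutMs <= 0 || expired) {
            value = loader.get();
            loadedAtMs = now;
            loaded = true;
        }
        return value;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`. Holding the lock during the calculation serializes callers, which is what prevents a burst of identical queries from each paying the full cost.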




[GitHub] [incubator-doris] caiconghui commented on issue #6477: [Performance] show proc statistic takes too long

Posted by GitBox <gi...@apache.org>.
caiconghui commented on issue #6477:
URL: https://github.com/apache/incubator-doris/issues/6477#issuecomment-904430361


   > > I think speeding up the calculation is needed, but in most cases we need the real-time cluster replica statistics; a cache may be confusing when we need to track down cluster problems.
   > 
   > The cache is configurable; setting the cache timeout to 0 could disable it. The cache is mainly there to prevent high-frequency queries within a short time, which can cause high calculation cost and heavy lock contention. A timeout of 1s may be close enough to real-time refreshing.
   > 
   > The cache is easy to implement. The previous version already stores the state, which may cause concurrency issues, so the cache is more a fix for a concurrency bug than a new feature.
   
   Maybe we can provide two choices, sync or async: high-frequency queries can use the async mode, while for other normal users we just speed up the calculation and use the sync mode. What do you think?
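
One way to read the sync/async proposal is sketched below, under the assumption that async means "serve the last snapshot, refreshed in the background" and sync means "calculate on every call". `StatisticSource` and its methods are hypothetical names, not Doris APIs.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical wrapper, not part of Doris.
final class StatisticSource<T> {
    private final Supplier<T> calculator;
    private final boolean async;
    private volatile T snapshot;   // last background result (async mode only)

    StatisticSource(Supplier<T> calculator, boolean async) {
        this.calculator = calculator;
        this.async = async;
        if (async) {
            snapshot = calculator.get();   // initial value so get() never returns null
        }
    }

    // Sync mode: always current, pays the full cost on each call.
    // Async mode: cheap, but may lag behind by up to one refresh period.
    T get() {
        return async ? snapshot : calculator.get();
    }

    // Recalculates the snapshot; called by the scheduler in async mode.
    void refresh() {
        snapshot = calculator.get();
    }

    // Starts a background thread that refreshes the snapshot periodically.
    // The caller is responsible for shutting the returned executor down.
    ScheduledExecutorService startRefreshing(long periodMs) {
        ScheduledExecutorService pool = Executors.newSingleThreadScheduledExecutor();
        pool.scheduleWithFixedDelay(this::refresh, periodMs, periodMs, TimeUnit.MILLISECONDS);
        return pool;
    }
}
```

The trade-off matches the concern raised in the thread: the async mode protects the cluster from frequent callers, while the sync mode keeps results real-time for troubleshooting.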




[GitHub] [incubator-doris] caiconghui commented on issue #6477: [Performance] show proc statistic takes too long

Posted by GitBox <gi...@apache.org>.
caiconghui commented on issue #6477:
URL: https://github.com/apache/incubator-doris/issues/6477#issuecomment-903477058


   > I'll push a PR later on; this is just a notice.
   
   I think speeding up the calculation is needed, but in most cases we need the real-time cluster replica statistics; a cache may be confusing when we need to track down cluster problems.

