Posted to notifications@skywalking.apache.org by GitBox <gi...@apache.org> on 2021/06/08 08:15:06 UTC

[GitHub] [skywalking] CrocoRyan opened a new issue #7082: Inaccurate database metrics due to Sharding JDBC plugin

CrocoRyan opened a new issue #7082:
URL: https://github.com/apache/skywalking/issues/7082


   Please answer these questions before submitting your issue.
   
   - Why do you submit this issue?
   - [✓ ] Bug
   ___
   ### Bug
   - Which version of SkyWalking, OS, and JRE?
   - SkyWalking 8.5.0 (latest release)
   - OS: macOS Catalina
   - JRE: openjdk version "1.8.0_265"
   
   - What happened?
   - When we use sharding-jdbc to process JDBC statements, ShardingSphere parses the original statement and breaks it up into several actual statements to execute. We want metrics such as CPM and latency for each original statement, but both are inaccurate. That's because the high-cardinality actual statements derived from the sharding operation are what gets counted when calculating the CPM and latency metrics, whereas the original statement, which is the one we find valuable, is not.
   
   In my view, the key problem is that the entire sub-trace branch, from the `/ShardingSphere/JDBCRootInvoke/` endpoint to the last `executeQuery` span, cannot obtain the original JDBC statement. As a result, the `db.statement` tag holds the actual statement instead of the original one, so when metrics are aggregated, the traffic is dispersed across the groups of the individual sharded statements.
   
   To better demonstrate this problem, I'll upload a small demo of this issue later.
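
Until the demo is available, the dispersion described above can be illustrated with a minimal Python sketch. This is not SkyWalking or ShardingSphere code: the record layout (trace id, original statement, actual statement), the table names, and the helper names are all hypothetical and exist only for illustration. It counts calls once per actual statement and once per original statement.

```python
from collections import Counter

# Hypothetical original (logic) statement; table names are made up.
LOGIC_SQL = "SELECT * FROM t_order WHERE user_id = ?"

# Hypothetical records: (trace_id, original statement, actual statement).
# Each original call is assumed to fan out to two sharded tables.
EXECUTIONS = [
    ("t1", LOGIC_SQL, "SELECT * FROM t_order_0 WHERE user_id = ?"),
    ("t1", LOGIC_SQL, "SELECT * FROM t_order_1 WHERE user_id = ?"),
    ("t2", LOGIC_SQL, "SELECT * FROM t_order_0 WHERE user_id = ?"),
    ("t2", LOGIC_SQL, "SELECT * FROM t_order_1 WHERE user_id = ?"),
]


def calls_by_actual(executions):
    """Count calls per actual statement (what grouping on db.statement gives)."""
    return Counter(actual for _, _, actual in executions)


def calls_by_logic(executions):
    """Count calls per original statement, once per (trace, statement) pair."""
    seen = {(trace_id, logic) for trace_id, logic, _ in executions}
    return Counter(logic for _, logic in seen)


if __name__ == "__main__":
    print(calls_by_actual(EXECUTIONS))
    print(calls_by_logic(EXECUTIONS))
```

Under these assumptions, grouping by the actual statement yields one group per shard, while grouping by the original statement keeps the two calls in a single group.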


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [skywalking] lujiajing1126 commented on issue #7082: Inaccurate database metrics due to Sharding JDBC plugin

Posted by GitBox <gi...@apache.org>.
lujiajing1126 commented on issue #7082:
URL: https://github.com/apache/skywalking/issues/7082#issuecomment-856629364


   > > In some of our business units, they are using Skywalking and want to have a dashboard for SQL(s) metrics.
   > 
   > But we don't have anything upstream, so where could we start the discussion if this is something that does not exist in open source or has not been accepted by the community?
   > 
   > > But traces cannot provide aggregational information like percentiles...
   > 
   > Again, no use case for upstream. We can't discuss a thing not existing in public.
   
   Maybe the author's original description is somewhat confusing.
   
   We already have the original SQL (or the formal one) as a tag in the LocalSpan (`/ShardingSphere/ParseSQL` below).
   
   ![image](https://user-images.githubusercontent.com/2568208/121163025-0713b300-c881-11eb-9fcb-c6dc12572511.png)
   
   The problem is that the LocalSpan (`/ShardingSphere/ParseSQL` in the figure) does not carry any real execution information. For example, we cannot get a real latency from this span, and it cannot be connected to the `MySQL/JDBC/Statement/executeQuery` span below.
   
   So the only issue we want to discuss here is whether it is possible to duplicate or move the tag (`db.statement` in `/ShardingSphere/ParseSQL`) to the `/ShardingSphere/executeSQL/` span.
   
   For the OAL and user-defined metrics, users can always write OAL to generate new metrics and add them to the Skywalking dashboard, right?
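
To make the "duplicate the tag" idea concrete, here is a toy Python model. It is only a sketch: the `Span` class and the `duplicate_statement_tag` helper are hypothetical and are not the SkyWalking agent API or the plugin's code. It also assumes the execute span does not already carry its own `db.statement`, so an existing value is never overwritten.

```python
from dataclasses import dataclass, field

# Operation names taken from the discussion above.
PARSE_OP = "/ShardingSphere/ParseSQL"
EXECUTE_OP_PREFIX = "/ShardingSphere/executeSQL/"


@dataclass
class Span:
    """Hypothetical, simplified span: just a name and a tag dictionary."""
    operation_name: str
    tags: dict = field(default_factory=dict)


def duplicate_statement_tag(spans):
    """Copy db.statement from the parse span onto each execute span."""
    logic_sql = next(
        (s.tags["db.statement"] for s in spans
         if s.operation_name == PARSE_OP and "db.statement" in s.tags),
        None,
    )
    if logic_sql is None:
        return spans  # nothing to copy
    for s in spans:
        if s.operation_name.startswith(EXECUTE_OP_PREFIX):
            # setdefault keeps any statement the span already has.
            s.tags.setdefault("db.statement", logic_sql)
    return spans
```

With the tag present on a span that has a real duration, a metric grouped on it would see one statement per logical query rather than one per shard.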
   



[GitHub] [skywalking] wu-sheng commented on issue #7082: Inaccurate database metrics due to Sharding JDBC plugin

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #7082:
URL: https://github.com/apache/skywalking/issues/7082#issuecomment-856607535


   > But it still may be helpful for better understanding the connection between the formal SQL and the actual SQL, since in the real application, the sharding rules can be extremely complex and cause high cardinality, right?
   
   That is the ShardingSphere community's decision. But we never calculate metrics for SQL. Where does the requirement come from?



[GitHub] [skywalking] wu-sheng commented on issue #7082: Inaccurate database metrics due to Sharding JDBC plugin

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #7082:
URL: https://github.com/apache/skywalking/issues/7082#issuecomment-856634219


   > The problem is that the LocalSpan (/ShardingSphere/ParseSQL in the figure) does not represent/carry any real execution information, for example, we cannot get real latency for this span and this also cannot be connected to the "MySQL/JDBC/Statement/executeQuery" span below.
   
   That is an issue with the ShardingSphere plugin; it doesn't end in async mode.
   
   > So the only issue we want to discuss here is whether it is possible to duplicate or move the tag (db.statement in /ShardingSphere/ParseSQL) to the /ShardingSphere/executeSQL/ span.
   
   What do you mean by "move"? Whether they tag the logic SQL statement is purely a plugin implementation decision and has nothing to do with metrics.
   
   > For the OAL and user-defined metrics, users can always write OAL to generate new metrics and add them to the Skywalking dashboard, right?
   
   This description is too general; I have to say yes.



[GitHub] [skywalking] wu-sheng commented on issue #7082: Inaccurate database metrics due to Sharding JDBC plugin

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #7082:
URL: https://github.com/apache/skywalking/issues/7082#issuecomment-856583015


   I highly doubt your statement. As a monitoring system, we deal with reality rather than the logical concept. The sharding plugin and ShardingSphere do logical sharding, but from the database's perspective it faces more than one statement, and that is SkyWalking's concern. We don't think this is an issue. This is expected and by design.



[GitHub] [skywalking] wu-sheng closed issue #7082: Inaccurate database metrics due to Sharding JDBC plugin

Posted by GitBox <gi...@apache.org>.
wu-sheng closed issue #7082:
URL: https://github.com/apache/skywalking/issues/7082


   



[GitHub] [skywalking] wu-sheng commented on issue #7082: Inaccurate database metrics due to Sharding JDBC plugin

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #7082:
URL: https://github.com/apache/skywalking/issues/7082#issuecomment-856618478


   > In some of our business units, they are using Skywalking and want to have a dashboard for SQL(s) metrics.
   
   But we don't have anything upstream, so where could we start the discussion if this is something that does not exist in open source or has not been accepted by the community?
   
   > But traces cannot provide aggregational information like percentiles...
   
   Again, no use case for upstream. We can't discuss a thing not existing in public.



[GitHub] [skywalking] lujiajing1126 commented on issue #7082: Inaccurate database metrics due to Sharding JDBC plugin

Posted by GitBox <gi...@apache.org>.
lujiajing1126 commented on issue #7082:
URL: https://github.com/apache/skywalking/issues/7082#issuecomment-856601319


   But it still may be helpful for better understanding the connection between the formal SQL and the actual SQL, since in the real application, the sharding rules can be extremely complex and cause high cardinality, right?
   
   For example, we could use OAL to calculate metrics (average latency, error rate, etc.) for the formal SQL to understand how SQL execution performs for a particular piece of business logic. Because the actual SQL statements are fragmented, we cannot get a general performance view at this level.



[GitHub] [skywalking] wu-sheng commented on issue #7082: Inaccurate database metrics due to Sharding JDBC plugin

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #7082:
URL: https://github.com/apache/skywalking/issues/7082#issuecomment-856608363


   > For example, we can use OAL to calculate the metrics (Avg. Latency, Error Rate, etc.) for the formal SQL to understand the performance of SQL execution for some particular business logic. Due to fragmentations of actual SQLs, we are not able to have a general performance view at this level.
   
   That is why traces exist. Diagnosing a performance issue like the one you described is not done from SQL-level metrics.



[GitHub] [skywalking] wu-sheng commented on issue #7082: Inaccurate database metrics due to Sharding JDBC plugin

Posted by GitBox <gi...@apache.org>.
wu-sheng commented on issue #7082:
URL: https://github.com/apache/skywalking/issues/7082#issuecomment-856635295


   From the naming perspective, `/ShardingSphere/ParseSQL` should not include the execution phase. If you have any doubts, submit a question to the ShardingSphere community. They contributed the whole plugin, so from SkyWalking's perspective they are the authority.



[GitHub] [skywalking] lujiajing1126 commented on issue #7082: Inaccurate database metrics due to Sharding JDBC plugin

Posted by GitBox <gi...@apache.org>.
lujiajing1126 commented on issue #7082:
URL: https://github.com/apache/skywalking/issues/7082#issuecomment-856616472


   > > But it still may be helpful for better understanding the connection between the formal SQL and the actual SQL, since in the real application, the sharding rules can be extremely complex and cause high cardinality, right?
   > 
   > That is the ShardingSphere community's decision. But we never calculate metrics for SQL. Where does the requirement come from?
   
   Some of our business units use SkyWalking and want a dashboard for SQL metrics. Because of the high cardinality caused by multi-table JOINs (Cartesian product), it is not easy to check the general performance of the SQL behind a given piece of business logic.
   
   > That is why traces exist. Diagnosing a performance issue like the one you described is not done from SQL-level metrics.
   
   But traces cannot provide aggregational information like percentiles...
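
As a rough illustration of the kind of aggregate meant here, the Python sketch below computes nearest-rank latency percentiles per formal SQL statement across many executions. It is not OAL and not SkyWalking code; the input shape (pairs of statement and latency in milliseconds) and both function names are assumptions made for the example.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def latency_percentiles(samples, pcts=(50, 90, 99)):
    """Group (formal_sql, latency_ms) pairs by statement and compute percentiles."""
    grouped = defaultdict(list)
    for sql, latency_ms in samples:
        grouped[sql].append(latency_ms)
    return {
        sql: {p: percentile(latencies, p) for p in pcts}
        for sql, latencies in grouped.items()
    }
```

A single trace only gives one latency sample, so a figure like p99 needs many executions grouped under the same statement, which is why the grouping key matters.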

