You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@shardingsphere.apache.org by Daniel Yu <yu...@outlook.com> on 2020/01/20 03:16:53 UTC

use sharding-proxy, query is slow

Hi :
     we have  a table 'member' with 160,000,000 rows ,

      sharding key is member_id
     after sharding the database to 16 databases, the 'member' 's table struct is same in 16 databases,
one database the 'member' table have 10,000,000 rows

     we   query  data by SQL "select * from member where member_id = 'abc' "  on   database 1 ,it cost 8s,


   when i used proxy query on  16 databases by the same sql ,  it  cost  60s ,  as far as I know the proxy used threadpool to Parallel execution the query , but  the query is slow , why?



Re:Re: Re: use sharding-proxy, query is slow

Posted by KimmKing <ki...@apache.org>.
Hi, Daniel,
IMO, SQL like your case will be routed to any one database, then the performance with sharding-proxy is almost equals with direct jdbc.
I guess 4 point maybe cause this issue:
1. system load and instances, just like yonglun talk about.
2. network issues, such as DNS etc.
3. router algorithm, like your own ComplexAlorithm.
4. a corner-case bug  in proxy, and we never meet it.
So, further more, can you submit more details for tuning and shooting it.
1. how many instances and their load in query time
2. there were network issues or not, you can use ip address instead of domain name or hostname and try again. Another way is just executing a simple sql, such as "select database()" or "show tables" via sharding-proxy and watch how long it return results
3. please check your own ComplexAlorithm or show us the codes
4. find the actual sql in proxy console and then run a sql-cli in the same machine to execute the actual sql here, watch how long it spend


Then we can profile this issue details to know what it happens and how to solve it.


At 2020-01-20 12:44:04, "Zhang Yonglun" <zh...@apache.org> wrote:
>Can you tell me  how many hosts or how may MySQL instance there are first?
>I think the performance is not about the server.yaml configuration.
>
>--
>
>Zhang Yonglun
>Apache ShardingSphere
>
>
>Daniel Yu <yu...@outlook.com> 于2020年1月20日周一 下午12:39写道:
>
>> ok, we will checkout the env load,it may be some performance problem
>>
>> another question,we have no experience on  the  server.yaml and
>> Performance Tuning about this config,
>> can you give some advice as you see in our server.yaml?
>>
>>
>> ________________________________
>> yukai596@outlook.com
>>
>> From: Zhang Yonglun<ma...@apache.org>
>> Date: 2020-01-20 12:24
>> To: dev<ma...@shardingsphere.apache.org>
>> Subject: Re: Re: use sharding-proxy, query is slow
>> The hostname xxx you  omitted is what I want to see. Just tell me how many
>> hosts or how may MySQL instance there are.
>> Show me the 'load' on any host of MySQL instance when query. If a query
>> with key costs 8s, the load will be very high.
>>
>> --
>>
>> Zhang Yonglun
>> Apache ShardingSphere
>>
>>
>> Daniel Yu <yu...@outlook.com> 于2020年1月20日周一 下午12:14写道:
>>
>> > Hi:
>> >    thx for you reply.
>> >
>> > 1. What's the cost when mysql client query database1 directly without
>> > sharding-proxy? (use member_id as where condition)
>> > it cost 8 seconds
>> > the cost means the SQL query time spent,  i use the command like " mysql
>> > -u -p   xxx" on mysql server, and i run the sql ,it print the cost
>> result.
>> >
>> > 2. Is 'member_id' the primary key?
>> > yes , the member_id is primary key ,  before sharding it is primary key,
>> > after sharding to 16 databases,  it is still  primary key .
>> >
>> > 3. Please show me the config-sharding.yaml and server.yaml.
>> > I omitted some cotent
>> >
>> > the config-sharding.yaml  :
>> > dataSources:
>> >   db0:
>> >     url: jdbc:mysql://xxx:3306/db0?useSSL=false
>> >     username: xxx
>> >     password: xxx
>> >     connectionTimeoutMilliseconds: 30000
>> >     idleTimeoutMilliseconds: 60000
>> >     maxLifetimeMilliseconds: 1800000
>> >     maxPoolSize: 65
>> > ...
>> >   db15:
>> >     url: jdbc:mysql://xxx:3306/db1?useSSL=false
>> >     username: root
>> >     password: root
>> >     connectionTimeoutMilliseconds: 30000
>> >     idleTimeoutMilliseconds: 60000
>> >     maxLifetimeMilliseconds: 1800000
>> >     maxPoolSize: 65
>> >
>> > shardingRule:
>> >   tables:
>> >     member:
>> >       actualDataNodes: db${0..15}.member
>> >       databaseStrategy:
>> >         complex:
>> >           shardingColumns: member_id
>> >           algorithmClassName:
>> > com.toonyoo.shardingjdbc.algorithm.ComplexAlorithm
>> >
>> > the server.yaml:
>> > authentication:
>> >   users:
>> >     root:
>> >       password: xxx
>> > props:
>> >   max.connections.size.per.query: 32
>> >   acceptor.size: 32
>> >   executor.size: 32
>> >   proxy.frontend.flush.threshold: 256
>> >   sql.show: false
>> > ________________________________
>> > yukai596@outlook.com
>> >
>> > From: Zhang Yonglun<ma...@apache.org>
>> > Date: 2020-01-20 11:51
>> > To: dev<ma...@shardingsphere.apache.org>
>> > Subject: Re: use sharding-proxy, query is slow
>> > 1. What's the cost when mysql client query database1 directly without
>> > sharding-proxy? (use member_id as where condition)
>> > 2. Is 'member_id' the primary key?
>> > 3. Please show me the config-sharding.yaml and server.yaml.
>> >
>> > --
>> >
>> > Zhang Yonglun
>> > Apache ShardingSphere
>> >
>> >
>> > Daniel Yu <yu...@outlook.com> 于2020年1月20日周一 上午11:17写道:
>> >
>> > > Hi :
>> > >      we have  a table 'member' with 160,000,000 rows ,
>> > >
>> > >       sharding key is member_id
>> > >      after sharding the database to 16 databases, the 'member' 's table
>> > > struct is same in 16 databases,
>> > > one database the 'member' table have 10,000,000 rows
>> > >
>> > >      we   query  data by SQL "select * from member where member_id =
>> > 'abc'
>> > > "  on   database 1 ,it cost 8s,
>> > >
>> > >
>> > >    when i used proxy query on  16 databases by the same sql ,  it  cost
>> > > 60s ,  as far as I know the proxy used threadpool to Parallel execution
>> > the
>> > > query , but  the query is slow , why?
>> > >
>> > >
>> > >
>> >
>>

Re: Re: use sharding-proxy, query is slow

Posted by Zhang Yonglun <zh...@apache.org>.
Can you tell me  how many hosts or how may MySQL instance there are first?
I think the performance is not about the server.yaml configuration.

--

Zhang Yonglun
Apache ShardingSphere


Daniel Yu <yu...@outlook.com> 于2020年1月20日周一 下午12:39写道:

> ok, we will checkout the env load,it may be some performance problem
>
> another question,we have no experience on  the  server.yaml and
> Performance Tuning about this config,
> can you give some advice as you see in our server.yaml?
>
>
> ________________________________
> yukai596@outlook.com
>
> From: Zhang Yonglun<ma...@apache.org>
> Date: 2020-01-20 12:24
> To: dev<ma...@shardingsphere.apache.org>
> Subject: Re: Re: use sharding-proxy, query is slow
> The hostname xxx you  omitted is what I want to see. Just tell me how many
> hosts or how may MySQL instance there are.
> Show me the 'load' on any host of MySQL instance when query. If a query
> with key costs 8s, the load will be very high.
>
> --
>
> Zhang Yonglun
> Apache ShardingSphere
>
>
> Daniel Yu <yu...@outlook.com> 于2020年1月20日周一 下午12:14写道:
>
> > Hi:
> >    thx for you reply.
> >
> > 1. What's the cost when mysql client query database1 directly without
> > sharding-proxy? (use member_id as where condition)
> > it cost 8 seconds
> > the cost means the SQL query time spent,  i use the command like " mysql
> > -u -p   xxx" on mysql server, and i run the sql ,it print the cost
> result.
> >
> > 2. Is 'member_id' the primary key?
> > yes , the member_id is primary key ,  before sharding it is primary key,
> > after sharding to 16 databases,  it is still  primary key .
> >
> > 3. Please show me the config-sharding.yaml and server.yaml.
> > I omitted some cotent
> >
> > the config-sharding.yaml  :
> > dataSources:
> >   db0:
> >     url: jdbc:mysql://xxx:3306/db0?useSSL=false
> >     username: xxx
> >     password: xxx
> >     connectionTimeoutMilliseconds: 30000
> >     idleTimeoutMilliseconds: 60000
> >     maxLifetimeMilliseconds: 1800000
> >     maxPoolSize: 65
> > ...
> >   db15:
> >     url: jdbc:mysql://xxx:3306/db1?useSSL=false
> >     username: root
> >     password: root
> >     connectionTimeoutMilliseconds: 30000
> >     idleTimeoutMilliseconds: 60000
> >     maxLifetimeMilliseconds: 1800000
> >     maxPoolSize: 65
> >
> > shardingRule:
> >   tables:
> >     member:
> >       actualDataNodes: db${0..15}.member
> >       databaseStrategy:
> >         complex:
> >           shardingColumns: member_id
> >           algorithmClassName:
> > com.toonyoo.shardingjdbc.algorithm.ComplexAlorithm
> >
> > the server.yaml:
> > authentication:
> >   users:
> >     root:
> >       password: xxx
> > props:
> >   max.connections.size.per.query: 32
> >   acceptor.size: 32
> >   executor.size: 32
> >   proxy.frontend.flush.threshold: 256
> >   sql.show: false
> > ________________________________
> > yukai596@outlook.com
> >
> > From: Zhang Yonglun<ma...@apache.org>
> > Date: 2020-01-20 11:51
> > To: dev<ma...@shardingsphere.apache.org>
> > Subject: Re: use sharding-proxy, query is slow
> > 1. What's the cost when mysql client query database1 directly without
> > sharding-proxy? (use member_id as where condition)
> > 2. Is 'member_id' the primary key?
> > 3. Please show me the config-sharding.yaml and server.yaml.
> >
> > --
> >
> > Zhang Yonglun
> > Apache ShardingSphere
> >
> >
> > Daniel Yu <yu...@outlook.com> 于2020年1月20日周一 上午11:17写道:
> >
> > > Hi :
> > >      we have  a table 'member' with 160,000,000 rows ,
> > >
> > >       sharding key is member_id
> > >      after sharding the database to 16 databases, the 'member' 's table
> > > struct is same in 16 databases,
> > > one database the 'member' table have 10,000,000 rows
> > >
> > >      we   query  data by SQL "select * from member where member_id =
> > 'abc'
> > > "  on   database 1 ,it cost 8s,
> > >
> > >
> > >    when i used proxy query on  16 databases by the same sql ,  it  cost
> > > 60s ,  as far as I know the proxy used threadpool to Parallel execution
> > the
> > > query , but  the query is slow , why?
> > >
> > >
> > >
> >
>

Re: Re: use sharding-proxy, query is slow

Posted by Daniel Yu <yu...@outlook.com>.
ok, we will checkout the env load,it may be some performance problem

another question,we have no experience on  the  server.yaml and Performance Tuning about this config,
can you give some advice as you see in our server.yaml?


________________________________
yukai596@outlook.com

From: Zhang Yonglun<ma...@apache.org>
Date: 2020-01-20 12:24
To: dev<ma...@shardingsphere.apache.org>
Subject: Re: Re: use sharding-proxy, query is slow
The hostname xxx you  omitted is what I want to see. Just tell me how many
hosts or how may MySQL instance there are.
Show me the 'load' on any host of MySQL instance when query. If a query
with key costs 8s, the load will be very high.

--

Zhang Yonglun
Apache ShardingSphere


Daniel Yu <yu...@outlook.com> 于2020年1月20日周一 下午12:14写道:

> Hi:
>    thx for you reply.
>
> 1. What's the cost when mysql client query database1 directly without
> sharding-proxy? (use member_id as where condition)
> it cost 8 seconds
> the cost means the SQL query time spent,  i use the command like " mysql
> -u -p   xxx" on mysql server, and i run the sql ,it print the cost result.
>
> 2. Is 'member_id' the primary key?
> yes , the member_id is primary key ,  before sharding it is primary key,
> after sharding to 16 databases,  it is still  primary key .
>
> 3. Please show me the config-sharding.yaml and server.yaml.
> I omitted some cotent
>
> the config-sharding.yaml  :
> dataSources:
>   db0:
>     url: jdbc:mysql://xxx:3306/db0?useSSL=false
>     username: xxx
>     password: xxx
>     connectionTimeoutMilliseconds: 30000
>     idleTimeoutMilliseconds: 60000
>     maxLifetimeMilliseconds: 1800000
>     maxPoolSize: 65
> ...
>   db15:
>     url: jdbc:mysql://xxx:3306/db1?useSSL=false
>     username: root
>     password: root
>     connectionTimeoutMilliseconds: 30000
>     idleTimeoutMilliseconds: 60000
>     maxLifetimeMilliseconds: 1800000
>     maxPoolSize: 65
>
> shardingRule:
>   tables:
>     member:
>       actualDataNodes: db${0..15}.member
>       databaseStrategy:
>         complex:
>           shardingColumns: member_id
>           algorithmClassName:
> com.toonyoo.shardingjdbc.algorithm.ComplexAlorithm
>
> the server.yaml:
> authentication:
>   users:
>     root:
>       password: xxx
> props:
>   max.connections.size.per.query: 32
>   acceptor.size: 32
>   executor.size: 32
>   proxy.frontend.flush.threshold: 256
>   sql.show: false
> ________________________________
> yukai596@outlook.com
>
> From: Zhang Yonglun<ma...@apache.org>
> Date: 2020-01-20 11:51
> To: dev<ma...@shardingsphere.apache.org>
> Subject: Re: use sharding-proxy, query is slow
> 1. What's the cost when mysql client query database1 directly without
> sharding-proxy? (use member_id as where condition)
> 2. Is 'member_id' the primary key?
> 3. Please show me the config-sharding.yaml and server.yaml.
>
> --
>
> Zhang Yonglun
> Apache ShardingSphere
>
>
> Daniel Yu <yu...@outlook.com> 于2020年1月20日周一 上午11:17写道:
>
> > Hi :
> >      we have  a table 'member' with 160,000,000 rows ,
> >
> >       sharding key is member_id
> >      after sharding the database to 16 databases, the 'member' 's table
> > struct is same in 16 databases,
> > one database the 'member' table have 10,000,000 rows
> >
> >      we   query  data by SQL "select * from member where member_id =
> 'abc'
> > "  on   database 1 ,it cost 8s,
> >
> >
> >    when i used proxy query on  16 databases by the same sql ,  it  cost
> > 60s ,  as far as I know the proxy used threadpool to Parallel execution
> the
> > query , but  the query is slow , why?
> >
> >
> >
>

Re: Re: use sharding-proxy, query is slow

Posted by Zhang Yonglun <zh...@apache.org>.
The hostname xxx you  omitted is what I want to see. Just tell me how many
hosts or how may MySQL instance there are.
Show me the 'load' on any host of MySQL instance when query. If a query
with key costs 8s, the load will be very high.

--

Zhang Yonglun
Apache ShardingSphere


Daniel Yu <yu...@outlook.com> 于2020年1月20日周一 下午12:14写道:

> Hi:
>    thx for you reply.
>
> 1. What's the cost when mysql client query database1 directly without
> sharding-proxy? (use member_id as where condition)
> it cost 8 seconds
> the cost means the SQL query time spent,  i use the command like " mysql
> -u -p   xxx" on mysql server, and i run the sql ,it print the cost result.
>
> 2. Is 'member_id' the primary key?
> yes , the member_id is primary key ,  before sharding it is primary key,
> after sharding to 16 databases,  it is still  primary key .
>
> 3. Please show me the config-sharding.yaml and server.yaml.
> I omitted some cotent
>
> the config-sharding.yaml  :
> dataSources:
>   db0:
>     url: jdbc:mysql://xxx:3306/db0?useSSL=false
>     username: xxx
>     password: xxx
>     connectionTimeoutMilliseconds: 30000
>     idleTimeoutMilliseconds: 60000
>     maxLifetimeMilliseconds: 1800000
>     maxPoolSize: 65
> ...
>   db15:
>     url: jdbc:mysql://xxx:3306/db1?useSSL=false
>     username: root
>     password: root
>     connectionTimeoutMilliseconds: 30000
>     idleTimeoutMilliseconds: 60000
>     maxLifetimeMilliseconds: 1800000
>     maxPoolSize: 65
>
> shardingRule:
>   tables:
>     member:
>       actualDataNodes: db${0..15}.member
>       databaseStrategy:
>         complex:
>           shardingColumns: member_id
>           algorithmClassName:
> com.toonyoo.shardingjdbc.algorithm.ComplexAlorithm
>
> the server.yaml:
> authentication:
>   users:
>     root:
>       password: xxx
> props:
>   max.connections.size.per.query: 32
>   acceptor.size: 32
>   executor.size: 32
>   proxy.frontend.flush.threshold: 256
>   sql.show: false
> ________________________________
> yukai596@outlook.com
>
> From: Zhang Yonglun<ma...@apache.org>
> Date: 2020-01-20 11:51
> To: dev<ma...@shardingsphere.apache.org>
> Subject: Re: use sharding-proxy, query is slow
> 1. What's the cost when mysql client query database1 directly without
> sharding-proxy? (use member_id as where condition)
> 2. Is 'member_id' the primary key?
> 3. Please show me the config-sharding.yaml and server.yaml.
>
> --
>
> Zhang Yonglun
> Apache ShardingSphere
>
>
> Daniel Yu <yu...@outlook.com> 于2020年1月20日周一 上午11:17写道:
>
> > Hi :
> >      we have  a table 'member' with 160,000,000 rows ,
> >
> >       sharding key is member_id
> >      after sharding the database to 16 databases, the 'member' 's table
> > struct is same in 16 databases,
> > one database the 'member' table have 10,000,000 rows
> >
> >      we   query  data by SQL "select * from member where member_id =
> 'abc'
> > "  on   database 1 ,it cost 8s,
> >
> >
> >    when i used proxy query on  16 databases by the same sql ,  it  cost
> > 60s ,  as far as I know the proxy used threadpool to Parallel execution
> the
> > query , but  the query is slow , why?
> >
> >
> >
>

Re: Re: use sharding-proxy, query is slow

Posted by Daniel Yu <yu...@outlook.com>.
Hi:
   thx for you reply.

1. What's the cost when mysql client query database1 directly without
sharding-proxy? (use member_id as where condition)
it cost 8 seconds
the cost means the SQL query time spent,  i use the command like " mysql  -u -p   xxx" on mysql server, and i run the sql ,it print the cost result.

2. Is 'member_id' the primary key?
yes , the member_id is primary key ,  before sharding it is primary key, after sharding to 16 databases,  it is still  primary key .

3. Please show me the config-sharding.yaml and server.yaml.
I omitted some cotent

the config-sharding.yaml  :
dataSources:
  db0:
    url: jdbc:mysql://xxx:3306/db0?useSSL=false
    username: xxx
    password: xxx
    connectionTimeoutMilliseconds: 30000
    idleTimeoutMilliseconds: 60000
    maxLifetimeMilliseconds: 1800000
    maxPoolSize: 65
...
  db15:
    url: jdbc:mysql://xxx:3306/db1?useSSL=false
    username: root
    password: root
    connectionTimeoutMilliseconds: 30000
    idleTimeoutMilliseconds: 60000
    maxLifetimeMilliseconds: 1800000
    maxPoolSize: 65

shardingRule:
  tables:
    member:
      actualDataNodes: db${0..15}.member
      databaseStrategy:
        complex:
          shardingColumns: member_id
          algorithmClassName: com.toonyoo.shardingjdbc.algorithm.ComplexAlorithm

the server.yaml:
authentication:
  users:
    root:
      password: xxx
props:
  max.connections.size.per.query: 32
  acceptor.size: 32
  executor.size: 32
  proxy.frontend.flush.threshold: 256
  sql.show: false
________________________________
yukai596@outlook.com

From: Zhang Yonglun<ma...@apache.org>
Date: 2020-01-20 11:51
To: dev<ma...@shardingsphere.apache.org>
Subject: Re: use sharding-proxy, query is slow
1. What's the cost when mysql client query database1 directly without
sharding-proxy? (use member_id as where condition)
2. Is 'member_id' the primary key?
3. Please show me the config-sharding.yaml and server.yaml.

--

Zhang Yonglun
Apache ShardingSphere


Daniel Yu <yu...@outlook.com> 于2020年1月20日周一 上午11:17写道:

> Hi :
>      we have  a table 'member' with 160,000,000 rows ,
>
>       sharding key is member_id
>      after sharding the database to 16 databases, the 'member' 's table
> struct is same in 16 databases,
> one database the 'member' table have 10,000,000 rows
>
>      we   query  data by SQL "select * from member where member_id = 'abc'
> "  on   database 1 ,it cost 8s,
>
>
>    when i used proxy query on  16 databases by the same sql ,  it  cost
> 60s ,  as far as I know the proxy used threadpool to Parallel execution the
> query , but  the query is slow , why?
>
>
>

Re: use sharding-proxy, query is slow

Posted by Zhang Yonglun <zh...@apache.org>.
1. What's the cost when mysql client query database1 directly without
sharding-proxy? (use member_id as where condition)
2. Is 'member_id' the primary key?
3. Please show me the config-sharding.yaml and server.yaml.

--

Zhang Yonglun
Apache ShardingSphere


Daniel Yu <yu...@outlook.com> 于2020年1月20日周一 上午11:17写道:

> Hi :
>      we have  a table 'member' with 160,000,000 rows ,
>
>       sharding key is member_id
>      after sharding the database to 16 databases, the 'member' 's table
> struct is same in 16 databases,
> one database the 'member' table have 10,000,000 rows
>
>      we   query  data by SQL "select * from member where member_id = 'abc'
> "  on   database 1 ,it cost 8s,
>
>
>    when i used proxy query on  16 databases by the same sql ,  it  cost
> 60s ,  as far as I know the proxy used threadpool to Parallel execution the
> query , but  the query is slow , why?
>
>
>