Posted to issues@phoenix.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/10/30 04:08:00 UTC

[jira] [Commented] (PHOENIX-5940) Pre-4.15 client cannot connect to 4.15+ server after SYSTEM.CATALOG region has split

    [ https://issues.apache.org/jira/browse/PHOENIX-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17223399#comment-17223399 ] 

ASF GitHub Bot commented on PHOENIX-5940:
-----------------------------------------

yanxinyi opened a new pull request #950:
URL: https://github.com/apache/phoenix/pull/950


   …TEM.CATALOG region has split




> Pre-4.15 client cannot connect to 4.15+ server after SYSTEM.CATALOG region has split
> ------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5940
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5940
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.14.3
>            Reporter: Chinmay Kulkarni
>            Assignee: Xinyi Yan
>            Priority: Blocker
>             Fix For: 4.16.0
>
>
> Steps to repro:
>  # Start the server with 4.15 or 4.16-SNAPSHOT (head of 4.x) with the default setting for splitting SYSTEM.CATALOG i.e. _phoenix.system.catalog.splittable=true_
>  # Connect with a 4.15+ client and create enough tables/views/indices to cause the SYSTEM.CATALOG region to split. For a quicker repro, you may want to set the following server-side configs:
>  ## _hbase.hregion.memstore.flush.size=1048576_ i.e. 1MB (to flush memstores quicker),
>  ## _hbase.hregion.max.filesize=2097152_ i.e. 2MB (so we don't have to load too much data to cause a region split),
>  ## _hbase.zookeeper.property.maxClientCnxns=-1_ (if we don’t set this, the default limit is easily hit when creating hundreds of views),
>  ## _hbase.table.sanity.checks=false_ (otherwise HBase complains that the HRegion max file size config is too small).
>  # With these configs, I've found that creating ~4000 views is sufficient to cause the SYSTEM.CATALOG region to split.
>  # Now connect with any pre-4.15 client like 4.14.3. Getting a connection will fail with the following stack trace:
> {noformat}
> Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException): org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
> 	at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
> 	at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
> 	at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
> 	at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
> 	... 9 more
> 	at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1275)
> 	at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
> 	at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.execService(ClientProtos.java:35542)
> 	at org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1702)
> 	... 13 more
> 20/06/04 19:14:18 WARN client.HTable: Error calling coprocessor service org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService for row
> java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
> 	at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
> 	at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
> 	at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
> 	at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
> 	... 9 more
> 	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> 	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> 	at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1775)
> 	at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1731)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl.checkClientServerCompatibility(ConnectionQueryServicesImpl.java:1350)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1239)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1576)
> 	at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2731)
> 	at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:1115)
> 	at org.apache.phoenix.compile.CreateTableCompiler$1.execute(CreateTableCompiler.java:192)
> 	at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:410)
> 	at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:393)
> 	at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
> 	at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:392)
> 	at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:380)
> 	at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1810)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2623)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2586)
> 	at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2586)
> 	at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
> 	at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:144)
> 	at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
> 	at sqlline.DatabaseConnection.connect(DatabaseConnection.java:157)
> 	at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:203)
> 	at sqlline.Commands.connect(Commands.java:1064)
> 	at sqlline.Commands.connect(Commands.java:996)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
> 	at sqlline.SqlLine.dispatch(SqlLine.java:809)
> 	at sqlline.SqlLine.initArgs(SqlLine.java:588)
> 	at sqlline.SqlLine.begin(SqlLine.java:661)
> 	at sqlline.SqlLine.start(SqlLine.java:398)
> 	at sqlline.SqlLine.main(SqlLine.java:291)
> {noformat}
> RS logs for the region throwing the error:
> {noformat}
> 2020-06-04 19:14:18,655 ERROR [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] coprocessor.MetaDataEndpointImpl: loading system catalog table inside getVersion failed
> java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
> 	at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
> 	at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
> 	at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> 2020-06-04 19:14:18,656 DEBUG [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] ipc.RpcServer: RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704: callId: 7 service: ClientService methodName: ExecService size: 131 connection: 10.3.4.181:57305
> org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
> 	at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
> 	at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
> 	at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
> 	at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
> 	... 9 more
> {noformat}
> This happens because a pre-4.15 client, inside CQSI.checkClientServerCompatibility, invokes the getVersion method on MetaDataEndpointImpl over *all SYSTEM.CATALOG regions* (we pass in null for startKey and endKey), see [this|https://github.com/apache/phoenix/blob/e2993552dc88cb7fc59fc0dfdaa2876ac260886c/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L1350].
> Inside MetaDataEndpointImpl#getVersion, we [call doGetTable|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L3224].
> Now, if SYSTEM.CATALOG has split, this call is also invoked on a region that does not contain the header row for SYSTEM.CATALOG, causing it to fail in MetaDataEndpointImpl#doGetTable [here|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2928-L2933].
> This is avoided in 4.15+ clients since we have restricted the getVersion invocation to the region containing the header row for SYSTEM.CATALOG (see [this|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L1517]).
> Inside MetaDataEndpointImpl, we need to add a special condition that accounts for pre-4.15 clients before propagating the error back to them.
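
For illustration, the failure mode described above can be sketched as a small, self-contained simulation. This is not Phoenix code and not the patch from the pull request: the Region class, the simplified header row key, the split point "T", and the methods callOnAllRegions and callOnHeaderRegion are hypothetical stand-ins. They only model the difference between fanning the getVersion call out to every region (pre-4.15 client behaviour) and sending it to the single region that holds the header row (4.15+ client behaviour).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SplitCatalogSketch {
    // Simplified stand-in for the row key of the SYSTEM.CATALOG header row (assumption).
    static final String HEADER_ROW_KEY = "SYSTEM.CATALOG";

    // A region covers [startKey, endKey); an empty string means "unbounded".
    static final class Region {
        final String startKey;
        final String endKey;

        Region(String startKey, String endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        boolean contains(String rowKey) {
            boolean afterStart = startKey.isEmpty() || rowKey.compareTo(startKey) >= 0;
            boolean beforeEnd = endKey.isEmpty() || rowKey.compareTo(endKey) < 0;
            return afterStart && beforeEnd;
        }
    }

    // Models the endpoint call on one region: it fails when the header row lives elsewhere.
    static String getVersion(Region region) {
        if (!region.contains(HEADER_ROW_KEY)) {
            throw new IllegalStateException(
                "doGetTable called for table not present on region tableName=SYSTEM.CATALOG");
        }
        return "version-ok";
    }

    // Pre-4.15 style: no key range given, so the call runs on every region.
    static List<String> callOnAllRegions(List<Region> regions) {
        List<String> results = new ArrayList<>();
        for (Region r : regions) {
            results.add(getVersion(r));
        }
        return results;
    }

    // 4.15+ style: only the region holding the header row is asked.
    static List<String> callOnHeaderRegion(List<Region> regions) {
        List<String> results = new ArrayList<>();
        for (Region r : regions) {
            if (r.contains(HEADER_ROW_KEY)) {
                results.add(getVersion(r));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Before the split there is a single region covering all keys.
        List<Region> unsplit = Arrays.asList(new Region("", ""));
        // Hypothetical split point "T": the header row sorts before it and stays in the first region.
        List<Region> split = Arrays.asList(new Region("", "T"), new Region("T", ""));

        System.out.println("unsplit, all regions: " + callOnAllRegions(unsplit));
        System.out.println("split, header region only: " + callOnHeaderRegion(split));
        try {
            callOnAllRegions(split);
            System.out.println("split, all regions: no error");
        } catch (IllegalStateException e) {
            System.out.println("split, all regions failed: " + e.getMessage());
        }
    }
}
```

In this model the fan-out call works while the table has a single region and starts failing once a second region exists that does not hold the header row, which matches the behaviour reported for pre-4.15 clients.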


