Posted to dev@phoenix.apache.org by "Xinyi Yan (Jira)" <ji...@apache.org> on 2020/11/07 21:02:00 UTC

[jira] [Updated] (PHOENIX-5940) Pre-4.15 client cannot connect to 4.15+ server after SYSTEM.CATALOG region has split

     [ https://issues.apache.org/jira/browse/PHOENIX-5940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xinyi Yan updated PHOENIX-5940:
-------------------------------
    Attachment: PHOENIX-5940.4.x.v1.patch

> Pre-4.15 client cannot connect to 4.15+ server after SYSTEM.CATALOG region has split
> ------------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5940
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5940
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.14.3
>            Reporter: Chinmay Kulkarni
>            Assignee: Xinyi Yan
>            Priority: Blocker
>             Fix For: 4.16.0
>
>         Attachments: PHOENIX-5940.4.x.v1.patch
>
>
> Steps to repro:
>  # Start the server with 4.15 or 4.16-SNAPSHOT (head of 4.x) with the default setting for splitting SYSTEM.CATALOG i.e. _phoenix.system.catalog.splittable=true_
>  # Connect with a 4.15+ client and create enough tables/views/indices to cause the SYSTEM.CATALOG region to split. For a quicker repro, you may want to set the following server-side configs:
>  ## _hbase.hregion.memstore.flush.size=1048576_ i.e. 1MB (to flush memstores quicker),
>  ## _hbase.hregion.max.filesize=2097152_ i.e. 2MB (so we don't have to load too much data to cause a region split),
>  ## _hbase.zookeeper.property.maxClientCnxns=-1_ (If we don’t set this, the default limit is easily hit when creating hundreds of views),
>  ## _hbase.table.sanity.checks=false_ (otherwise HBase complains that the HRegion max file size config is too small).
>  # With these configs, I've found that creating ~4000 views is sufficient to cause the SYSTEM.CATALOG region to split.
>  # Now connect with any pre-4.15 client like 4.14.3. Getting a connection will fail with the following stack trace:
> {noformat}
> Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException): org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
> 	at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
> 	at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
> 	at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
> 	at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
> 	... 9 more
> 	at org.apache.hadoop.hbase.ipc.RpcClientImpl.call(RpcClientImpl.java:1275)
> 	at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:227)
> 	at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:336)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$BlockingStub.execService(ClientProtos.java:35542)
> 	at org.apache.hadoop.hbase.protobuf.ProtobufUtil.execService(ProtobufUtil.java:1702)
> 	... 13 more
> 20/06/04 19:14:18 WARN client.HTable: Error calling coprocessor service org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService for row
> java.util.concurrent.ExecutionException: org.apache.hadoop.hbase.DoNotRetryIOException: org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
> 	at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
> 	at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
> 	at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
> 	at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
> 	... 9 more
> 	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
> 	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
> 	at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1775)
> 	at org.apache.hadoop.hbase.client.HTable.coprocessorService(HTable.java:1731)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl.checkClientServerCompatibility(ConnectionQueryServicesImpl.java:1350)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1239)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1576)
> 	at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2731)
> 	at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:1115)
> 	at org.apache.phoenix.compile.CreateTableCompiler$1.execute(CreateTableCompiler.java:192)
> 	at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:410)
> 	at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:393)
> 	at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
> 	at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:392)
> 	at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:380)
> 	at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1810)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2623)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2586)
> 	at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
> 	at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2586)
> 	at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
> 	at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:144)
> 	at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
> 	at sqlline.DatabaseConnection.connect(DatabaseConnection.java:157)
> 	at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:203)
> 	at sqlline.Commands.connect(Commands.java:1064)
> 	at sqlline.Commands.connect(Commands.java:996)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:38)
> 	at sqlline.SqlLine.dispatch(SqlLine.java:809)
> 	at sqlline.SqlLine.initArgs(SqlLine.java:588)
> 	at sqlline.SqlLine.begin(SqlLine.java:661)
> 	at sqlline.SqlLine.start(SqlLine.java:398)
> 	at sqlline.SqlLine.main(SqlLine.java:291)
> {noformat}
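
The view-creation step in the repro above can be sketched as follows. This is only a minimal illustration, not part of the issue or the patch: the class name, the REPRO_BASE table and the REPRO_VIEW_ prefix are hypothetical, and actually running the DDL assumes a 4.15+ Phoenix client jar on the classpath and a JDBC URL passed as the first argument. Without an argument the program only prints a few of the generated statements.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.List;

public class CatalogSplitRepro {
    // Hypothetical names chosen for this sketch; they do not come from the issue.
    static final String BASE_TABLE = "REPRO_BASE";
    static final String VIEW_PREFIX = "REPRO_VIEW_";

    /** Builds the DDL: one base table followed by viewCount views over it. */
    static List<String> buildDdl(int viewCount) {
        List<String> ddl = new ArrayList<>();
        ddl.add("CREATE TABLE IF NOT EXISTS " + BASE_TABLE
                + " (ID VARCHAR PRIMARY KEY, V VARCHAR)");
        for (int i = 0; i < viewCount; i++) {
            ddl.add("CREATE VIEW IF NOT EXISTS " + VIEW_PREFIX + i
                    + " AS SELECT * FROM " + BASE_TABLE);
        }
        return ddl;
    }

    /** Executes the DDL over JDBC; needs a Phoenix client jar on the classpath. */
    static void run(String jdbcUrl, int viewCount) throws SQLException {
        try (Connection conn = DriverManager.getConnection(jdbcUrl);
             Statement stmt = conn.createStatement()) {
            for (String sql : buildDdl(viewCount)) {
                stmt.execute(sql);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        if (args.length == 0) {
            // No JDBC URL given: just show what would be executed.
            buildDdl(3).forEach(System.out::println);
            return;
        }
        // Roughly 4000 views, the number the repro steps report as sufficient.
        run(args[0], 4000);
    }
}
```

Each view adds rows to SYSTEM.CATALOG, so with the small flush and max-file sizes listed in the steps the region should grow past the split threshold; the exact count needed may differ per environment.
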
> RS logs for the region throwing the error:
> {noformat}
> 2020-06-04 19:14:18,655 ERROR [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] coprocessor.MetaDataEndpointImpl: loading system catalog table inside getVersion failed
> java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
> 	at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
> 	at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
> 	at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> 2020-06-04 19:14:18,656 DEBUG [RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704] ipc.RpcServer: RpcServer.FifoWFPBQ.default.handler=29,queue=2,port=56704: callId: 7 service: ClientService methodName: ExecService size: 131 connection: 10.3.4.181:57305
> org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2013 (INT15): ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG SYSTEM.CATALOG
> 	at org.apache.phoenix.util.ServerUtil.createIOException(ServerUtil.java:114)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3214)
> 	at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:17268)
> 	at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8338)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2170)
> 	at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2152)
> 	at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:35076)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2394)
> 	at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:123)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:188)
> 	at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:168)
> Caused by: java.sql.SQLException: ERROR 2013 (INT15): MetadataEndpointImpl doGetTable called for table not present on region tableName=SYSTEM.CATALOG
> 	at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:575)
> 	at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:195)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2916)
> 	at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getVersion(MetaDataEndpointImpl.java:3208)
> 	... 9 more
> {noformat}
> This happens because a pre-4.15 client, inside CQSI.checkClientServerCompatibility, invokes the getVersion method on MetaDataEndpointImpl over *all SYSTEM.CATALOG regions* (we pass in null for startKey and endKey), see [this|https://github.com/apache/phoenix/blob/e2993552dc88cb7fc59fc0dfdaa2876ac260886c/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L1350].
> Inside MetaDataEndpointImpl#getVersion, we [call doGetTable|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L3224].
>  Now, if SYSTEM.CATALOG has split, this call will also be invoked on a region that does not contain the header row for SYSTEM.CATALOG, causing it to fail in MetaDataEndpointImpl#doGetTable [here|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/coprocessor/MetaDataEndpointImpl.java#L2928-L2933].
> This is avoided in 4.15+ clients since we have restricted the getVersion invocation to the region containing the header row for SYSTEM.CATALOG (see [this|https://github.com/apache/phoenix/blob/77c6cb32fce04b912b7c502dc170d86af8293fe6/phoenix-core/src/main/java/org/apache/phoenix/query/ConnectionQueryServicesImpl.java#L1517]).
> Inside MetaDataEndpointImpl, we need to add a special condition for pre-4.15 clients before propagating the error back to the client.
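
One possible reading of that special condition is modeled below in plain Java. It is a simplified sketch and not the attached patch or Phoenix's real code: the Region class, the version encoding, the row key bytes and the assumption that the server can see the client's major/minor version are all illustrative. The model only shows the decision: a region that does not hold the SYSTEM.CATALOG header row answers a pre-4.15 client without attempting the table load, while newer clients still get the error.

```java
import java.nio.charset.StandardCharsets;

public class GetVersionSketch {
    // Made-up value standing in for the encoded server version.
    static final long SERVER_VERSION = 4_016_000L;

    /** Stand-in for a SYSTEM.CATALOG region covering [startKey, endKey); empty means unbounded. */
    static final class Region {
        final byte[] startKey;
        final byte[] endKey;

        Region(byte[] startKey, byte[] endKey) {
            this.startKey = startKey;
            this.endKey = endKey;
        }

        boolean contains(byte[] rowKey) {
            boolean afterStart = startKey.length == 0 || compareUnsigned(rowKey, startKey) >= 0;
            boolean beforeEnd = endKey.length == 0 || compareUnsigned(rowKey, endKey) < 0;
            return afterStart && beforeEnd;
        }
    }

    /** Unsigned lexicographic byte comparison, the ordering HBase uses for row keys. */
    static int compareUnsigned(byte[] a, byte[] b) {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    static boolean isPre415(int major, int minor) {
        return major < 4 || (major == 4 && minor < 15);
    }

    /** Hypothetical getVersion: tolerate old clients on regions without the header row. */
    static long getVersion(Region region, byte[] headerRowKey, int clientMajor, int clientMinor) {
        if (!region.contains(headerRowKey)) {
            if (isPre415(clientMajor, clientMinor)) {
                // Old clients call every region; skip the table load instead of failing.
                return SERVER_VERSION;
            }
            throw new IllegalStateException("table not present on region");
        }
        // The real endpoint would load the SYSTEM.CATALOG metadata here.
        return SERVER_VERSION;
    }

    public static void main(String[] args) {
        // Illustrative header row key and split point, not taken from a real cluster.
        byte[] header = "\0SYSTEM\0CATALOG".getBytes(StandardCharsets.UTF_8);
        byte[] split = "T".getBytes(StandardCharsets.UTF_8);
        Region first = new Region(new byte[0], split);
        Region second = new Region(split, new byte[0]);

        System.out.println("4.14 client, first region: " + getVersion(first, header, 4, 14));
        System.out.println("4.14 client, second region: " + getVersion(second, header, 4, 14));
        try {
            getVersion(second, header, 4, 15);
        } catch (IllegalStateException e) {
            System.out.println("4.15 client, second region: " + e.getMessage());
        }
    }
}
```

Keeping the error for 4.15+ clients is a deliberate choice in this sketch: those clients already restrict the call to the header-row region, so reaching another region would indicate a real problem rather than expected old-client behavior.
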



--
This message was sent by Atlassian Jira
(v8.3.4#803005)