Posted to issues@kylin.apache.org by "Tuo Zhu (JIRA)" <ji...@apache.org> on 2019/06/19 09:19:00 UTC
[jira] [Comment Edited] (KYLIN-3973) InvalidProtocolBufferException: Protocol message was too large. May be malicious.
[ https://issues.apache.org/jira/browse/KYLIN-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867393#comment-16867393 ]
Tuo Zhu edited comment on KYLIN-3973 at 6/19/19 9:18 AM:
---------------------------------------------------------
Same here. This happens when you have a normal dimension with relatively high cardinality and you query by a derived dimension that belongs to it.
For example, we have an M_ID in our partitioned fact table (cardinality somewhere near 100000) and an N_ID in a mapping table (cardinality less than 100). M_ID belongs to N_ID. We have another derived column, "Region", with cardinality 40.
When we query with "group by N_ID and Region" and "select count(distinct UID)" (hundreds of millions of distinct UIDs), this issue appears. I guess too many rows are selected when you group by N_ID (many N_IDs have thousands of M_IDs), which causes this issue.
I guess we could increase
{code:java}
com.google.protobuf.CodedInputStream#DEFAULT_SIZE_LIMIT
{code} in the protobuf library,
but a more robust fix would be to page the results when fetching data from HBase?
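To illustrate the kind of guard that triggers this exception, here is a minimal, stdlib-only Java sketch of a length-prefixed read with a configurable size limit. All names and the framing here are illustrative assumptions, not protobuf's actual wire format or implementation; the real check lives inside CodedInputStream (see setSizeLimit() in the error message above).
{code:java}
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class SizeLimitSketch {
    // Read a length-prefixed payload, rejecting any message whose declared
    // size exceeds sizeLimit -- analogous to protobuf's
    // "Protocol message was too large. May be malicious." check.
    static byte[] readLimited(DataInputStream in, int sizeLimit) throws IOException {
        int declared = in.readInt();        // big-endian length prefix
        if (declared > sizeLimit) {
            throw new IOException("message of " + declared
                    + " bytes exceeds limit of " + sizeLimit);
        }
        byte[] buf = new byte[declared];
        in.readFully(buf);
        return buf;
    }

    public static void main(String[] args) throws IOException {
        // 4-byte length prefix (value 8) followed by an 8-byte payload.
        byte[] framed = new byte[4 + 8];
        framed[3] = 8;
        byte[] ok = readLimited(
                new DataInputStream(new ByteArrayInputStream(framed)), 64);
        System.out.println(ok.length);      // within the 64-byte limit

        try {
            readLimited(new DataInputStream(new ByteArrayInputStream(framed)), 4);
        } catch (IOException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
{code}
Raising the limit (as protobuf allows via CodedInputStream.setSizeLimit()) only moves the ceiling; paging the HBase results would keep each response small instead.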
was (Author: sickcate):
Same here. This happens when you have a normal dimension with relatively high cardinality and you query by a derived dimension that belongs to it.
For example, we have an M_ID in our partitioned fact table (cardinality somewhere near 100000) and an N_ID in a mapping table (cardinality less than 100). M_ID belongs to N_ID. We have another derived column, "Region", with cardinality 40. When we query with "group by N_ID and Region", this issue appears. I guess too many rows are selected when you group by N_ID (many N_IDs have thousands of M_IDs), which causes this issue. Is there any way we can increase this limit?
> InvalidProtocolBufferException: Protocol message was too large. May be malicious.
> ----------------------------------------------------------------------------------
>
> Key: KYLIN-3973
> URL: https://issues.apache.org/jira/browse/KYLIN-3973
> Project: Kylin
> Issue Type: Bug
> Affects Versions: v2.6.1
> Reporter: Grzegorz Kołakowski
> Priority: Major
>
> For many queries I receive the following exception.
> {noformat}
> 2019-04-23 11:33:15,576 WARN [kylin-coproc--pool6-t17] client.SyncCoprocessorRpcChannel:54 : Call failed on IOException
> com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large. May be malicious. Use CodedInputStream.setSizeLimit() to increase the size limit.
> at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
> at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
> at com.google.protobuf.CodedInputStream.isAtEnd(CodedInputStream.java:701)
> at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:99)
> at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse.<init>(CubeVisitProtos.java:2307)
> at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse.<init>(CubeVisitProtos.java:2271)
> at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse$1.parsePartialFrom(CubeVisitProtos.java:2380)
> at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse$1.parsePartialFrom(CubeVisitProtos.java:2375)
> at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse$Builder.mergeFrom(CubeVisitProtos.java:5101)
> at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse$Builder.mergeFrom(CubeVisitProtos.java:4949)
> at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:337)
> at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
> at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:210)
> at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:904)
> at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
> at org.apache.hadoop.hbase.ipc.CoprocessorRpcUtils.getResponse(CoprocessorRpcUtils.java:141)
> at org.apache.hadoop.hbase.client.RegionCoprocessorRpcChannel.callExecService(RegionCoprocessorRpcChannel.java:94)
> at org.apache.hadoop.hbase.client.SyncCoprocessorRpcChannel.callMethod(SyncCoprocessorRpcChannel.java:52)
> at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitService$Stub.visitCube(CubeVisitProtos.java:5616)
> at org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC$1$1.call(CubeHBaseEndpointRPC.java:246)
> at org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC$1$1.call(CubeHBaseEndpointRPC.java:242)
> at org.apache.hadoop.hbase.client.HTable$12.call(HTable.java:1012)
> at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> I use lz4 compression algorithm in HBase.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)