Posted to issues@kylin.apache.org by "Tuo Zhu (JIRA)" <ji...@apache.org> on 2019/06/19 09:19:00 UTC

[jira] [Comment Edited] (KYLIN-3973) InvalidProtocolBufferException: Protocol message was too large. May be malicious.

    [ https://issues.apache.org/jira/browse/KYLIN-3973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16867393#comment-16867393 ] 

Tuo Zhu edited comment on KYLIN-3973 at 6/19/19 9:18 AM:
---------------------------------------------------------

Same here. This happens when you have a normal dimension with relatively high cardinality and you query by a derived dimension that belongs to it.

For example, we have an M_ID column in our partitioned fact table (cardinality somewhere near 100,000) and an N_ID column in a mapping table (cardinality less than 100). Each M_ID belongs to an N_ID. We also have another derived column, "Region", with cardinality 40.

When we query with "group by N_ID and Region" and "select count(distinct UID)" (hundreds of millions of distinct UIDs), this issue pops up. I guess too many rows are selected when grouping by N_ID (many N_IDs contain thousands of M_IDs), which causes this issue.

I guess we could probably increase
{code:java}
com.google.protobuf.CodedInputStream#DEFAULT_SIZE_LIMIT
{code}
in the protobuf library, but a more sophisticated approach would be to use paging when fetching data from HBase?
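
To make the first idea concrete, here is a minimal sketch (my own illustration, not existing Kylin code) of parsing a response through a CodedInputStream with a raised size limit, as the exception message itself suggests; the 256 MB value and the helper class name are assumptions:
{code:java}
import java.io.IOException;

import com.google.protobuf.CodedInputStream;
import org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos;

public class LargeResponseParser {
    // Sketch only: parse a serialized CubeVisitResponse with a size limit larger
    // than protobuf's default (64 MB in the protobuf version used here).
    public static CubeVisitProtos.CubeVisitResponse parse(byte[] rawResponse) throws IOException {
        CodedInputStream input = CodedInputStream.newInstance(rawResponse);
        input.setSizeLimit(256 * 1024 * 1024); // assumed 256 MB cap, tune as needed
        return CubeVisitProtos.CubeVisitResponse.parseFrom(input);
    }
}
{code}
In our case, though, the response is actually parsed inside the HBase client (CoprocessorRpcUtils.getResponse in the stack trace below), not in Kylin code, so the limit would have to be raised there; that is why paging the data fetched from HBase still looks like the cleaner fix.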





was (Author: sickcate):
Same here. This happens when you have a normal dimension with relatively high cardinality and you query by a derived dimension that belongs to it.

For example, we have an M_ID column in our partitioned fact table (cardinality somewhere near 100,000) and an N_ID column in a mapping table (cardinality less than 100). Each M_ID belongs to an N_ID. We also have another derived column, "Region", with cardinality 40. When we query with "group by N_ID and Region", this issue pops up. I guess too many rows are selected when grouping by N_ID (many N_IDs contain thousands of M_IDs), which causes this issue. Is there any way we can increase this limit?

> InvalidProtocolBufferException: Protocol message was too large.  May be malicious.
> ----------------------------------------------------------------------------------
>
>                 Key: KYLIN-3973
>                 URL: https://issues.apache.org/jira/browse/KYLIN-3973
>             Project: Kylin
>          Issue Type: Bug
>    Affects Versions: v2.6.1
>            Reporter: Grzegorz Kołakowski
>            Priority: Major
>
> For many queries I receive the following exception.
> {noformat}
> 2019-04-23 11:33:15,576 WARN  [kylin-coproc--pool6-t17] client.SyncCoprocessorRpcChannel:54 : Call failed on IOException
> com.google.protobuf.InvalidProtocolBufferException: Protocol message was too large.  May be malicious.  Use CodedInputStream.setSizeLimit() to increase the size limit.
>         at com.google.protobuf.InvalidProtocolBufferException.sizeLimitExceeded(InvalidProtocolBufferException.java:110)
>         at com.google.protobuf.CodedInputStream.refillBuffer(CodedInputStream.java:755)
>         at com.google.protobuf.CodedInputStream.isAtEnd(CodedInputStream.java:701)
>         at com.google.protobuf.CodedInputStream.readTag(CodedInputStream.java:99)
>         at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse.<init>(CubeVisitProtos.java:2307)
>         at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse.<init>(CubeVisitProtos.java:2271)
>         at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse$1.parsePartialFrom(CubeVisitProtos.java:2380)
>         at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse$1.parsePartialFrom(CubeVisitProtos.java:2375)
>         at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse$Builder.mergeFrom(CubeVisitProtos.java:5101)
>         at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitResponse$Builder.mergeFrom(CubeVisitProtos.java:4949)
>         at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:337)
>         at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
>         at com.google.protobuf.AbstractMessageLite$Builder.mergeFrom(AbstractMessageLite.java:210)
>         at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:904)
>         at com.google.protobuf.AbstractMessage$Builder.mergeFrom(AbstractMessage.java:267)
>         at org.apache.hadoop.hbase.ipc.CoprocessorRpcUtils.getResponse(CoprocessorRpcUtils.java:141)
>         at org.apache.hadoop.hbase.client.RegionCoprocessorRpcChannel.callExecService(RegionCoprocessorRpcChannel.java:94)
>         at org.apache.hadoop.hbase.client.SyncCoprocessorRpcChannel.callMethod(SyncCoprocessorRpcChannel.java:52)
>         at org.apache.kylin.storage.hbase.cube.v2.coprocessor.endpoint.generated.CubeVisitProtos$CubeVisitService$Stub.visitCube(CubeVisitProtos.java:5616)
>         at org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC$1$1.call(CubeHBaseEndpointRPC.java:246)
>         at org.apache.kylin.storage.hbase.cube.v2.CubeHBaseEndpointRPC$1$1.call(CubeHBaseEndpointRPC.java:242)
>         at org.apache.hadoop.hbase.client.HTable$12.call(HTable.java:1012)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> I use the lz4 compression algorithm in HBase.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)