You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user-zh@flink.apache.org by 叶贤勋 <yx...@163.com> on 2020/07/13 12:34:22 UTC
使用Flink Array Field Type
Flink 1.10.0
问题描述:source表中有个test_array_string ARRAY<VARCHAR>字段,在DML语句用test_array_string[0]获取数组中的值会报数组越界异常。另外测试过Array<Varchar>也是相同错误,Array<int>,Array<bigint>等类型也会报数组越界问题。
请问这是Flink1.10的bug吗?
SQL:
CREATETABLE source (
……
test_array_string ARRAY<VARCHAR>
) WITH (
'connector.type'='kafka',
'update-mode'='append',
'format.type'='json'
……
);
CREATETABLE sink(
v_string string
) WITH (
……
);
INSERTINTO
sink
SELECT
test_array_string[0] as v_string
from
source;
kafka样例数据:{"id":1,"test_array_string":["ff”]}
Flink 执行的时候报以下错误:
java.lang.ArrayIndexOutOfBoundsException: 33554432
at org.apache.flink.table.runtime.util.SegmentsUtil.getByteMultiSegments(SegmentsUtil.java:598)
at org.apache.flink.table.runtime.util.SegmentsUtil.getByte(SegmentsUtil.java:590)
at org.apache.flink.table.runtime.util.SegmentsUtil.bitGet(SegmentsUtil.java:534)
at org.apache.flink.table.dataformat.BinaryArray.isNullAt(BinaryArray.java:117)
at StreamExecCalc$9.processElement(UnknownSource)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:641)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:616)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:596)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:730)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:708)
at SourceConversion$1.processElement(UnknownSource)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:641)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:616)
at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:596)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:730)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:708)
at org.apache.flink.streaming.api.operators.StreamSourceContexts$ManualWatermarkContext.processAndCollectWithTimestamp(StreamSourceContexts.java:310)
at org.apache.flink.streaming.api.operators.StreamSourceContexts$WatermarkContext.collectWithTimestamp(StreamSourceContexts.java:409)
at org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher.emitRecordWithTimestamp(AbstractFetcher.java:408)
at org.apache.flink.streaming.connectors.kafka.internal.KafkaFetcher.emitRecord(KafkaFetcher.java:185)
at org.apache.flink.streaming.connectors.kafka.internal.KafkaFetcher.runFetchLoop(KafkaFetcher.java:150)
at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:715)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:100)
at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:63)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask$LegacySourceFunctionThread.run(SourceStreamTask.java:196)
| |
叶贤勋
|
|
yxx_cmhd@163.com
|
签名由网易邮箱大师定制
回复: 使用Flink Array Field Type
Posted by 叶贤勋 <yx...@163.com>.
谢谢 Leonard的解答。刚刚也看到了这个jira单[1]
[1] https://issues.apache.org/jira/browse/FLINK-17847
| |
叶贤勋
|
|
yxx_cmhd@163.com
|
签名由网易邮箱大师定制
在2020年07月13日 20:50,Leonard Xu<xb...@gmail.com> 写道:
Hi,
SQL 中数据下标是从1开始的,不是从0,所以会有数组越界问题。建议使用数组时通过 select arr[5] from T where CARDINALITY(arr) >= 5 这种方式防止数组访问越界。
祝好,
Leonard Xu
在 2020年7月13日,20:34,叶贤勋 <yx...@163.com> 写道:
test_array_string[0]
Re: 使用Flink Array Field Type
Posted by Leonard Xu <xb...@gmail.com>.
Hi,
SQL 中数据下标是从1开始的,不是从0,所以会有数组越界问题。建议使用数组时通过 select arr[5] from T where CARDINALITY(arr) >= 5 这种方式防止数组访问越界。
祝好,
Leonard Xu
> 在 2020年7月13日,20:34,叶贤勋 <yx...@163.com> 写道:
>
> test_array_string[0]