Posted to user-zh@flink.apache.org by Luan Cooper <gc...@gmail.com> on 2020/07/17 06:40:11 UTC

SQL job fails with only an NPE from the Flink runtime

Hi

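I have a SQL statement like this:
INSERT INTO es
SELECT
a,
udf_xxx(b)
FROM mongo_oplog -- custom TableFactory

The job failed after submission. Between submission and failure there is only one NPE, shown below, and it comes from non-business code. There is no exception from any business code, and the failure reproduces reliably.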

LUE _UTF-16LE'v2'))) AS return_received_time]) (1/1)
(bdf9b131f82a8ebc440165b82b89e570) switched from RUNNING to FAILED.

java.lang.NullPointerException
    at StreamExecCalc$8016.split$7938$(Unknown Source)
    at StreamExecCalc$8016.processElement(Unknown Source)
    at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask$StreamTaskNetworkOutput.emitRecord(OneInputStreamTask.java:173)
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.processElement(StreamTaskNetworkInput.java:151)
    at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.emitNext(StreamTaskNetworkInput.java:128)
    at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:69)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:311)
    at org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:187)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:487)
    at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:470)
    at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:707)
    at org.apache.flink.runtime.taskmanager.Task.run(Task.java:532)
    at java.lang.Thread.run(Thread.java:748)

How should I go about troubleshooting this kind of problem?
Any clue is welcome.

Thanks

Testing the community mailing list

Posted by "sjlsumaitong@163.com" <sj...@163.com>.
Please ignore.



sjlsumaitong@163.com
 

Re: SQL job fails with only an NPE from the Flink runtime

Posted by Jark Wu <im...@gmail.com>.
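This exception is usually caused by a UDF that is implemented with a primitive type (int) while the actual field contains null values.
Could you try adding a WHERE condition first to filter out the null values?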

Best,
Jark
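
The failure mode described above can be sketched in plain Java, without any Flink dependency. The class and method names below are hypothetical and do not come from the thread. The sketch only shows the language-level behaviour: passing a null `Integer` to a method that takes a primitive `int` throws the NPE at the call site, before the method body runs. That this is what happens inside the generated `StreamExecCalc` code is an assumption, although it would be consistent with the stack trace containing no frame from user code.

```java
public class NullUnboxingSketch {

    // Hypothetical UDF body written against a primitive: it cannot represent SQL NULL.
    public static int evalPrimitive(int b) {
        return b + 1;
    }

    // Null-tolerant variant: boxed parameter and return type, NULL is passed through.
    public static Integer evalBoxed(Integer b) {
        return b == null ? null : b + 1;
    }

    public static void main(String[] args) {
        Integer fieldValue = null; // stands in for a NULL column value

        try {
            // Auto-unboxing happens here, in the caller, before evalPrimitive runs,
            // so the top stack frame of the NPE is the caller and not the UDF.
            evalPrimitive(fieldValue);
        } catch (NullPointerException e) {
            System.out.println("NPE raised in: " + e.getStackTrace()[0].getMethodName());
        }

        // The boxed variant accepts the null input and returns null.
        System.out.println("boxed result: " + evalBoxed(fieldValue));
    }
}
```

Under this reading there are two possible remedies: filter the null rows out with a WHERE condition, as suggested above, or declare the UDF parameter with a boxed type such as `Integer` and handle null explicitly.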


On Mon, 20 Jul 2020 at 15:28, godfrey he <go...@gmail.com> wrote:

> I can't see the image. Please upload it with a different image-hosting tool.

Re: SQL job fails with only an NPE from the Flink runtime

Posted by godfrey he <go...@gmail.com>.
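I can't see the image. Please upload it with a different image-hosting tool.

On Fri, Jul 17, 2020 at 4:11 PM Luan Cooper <gc...@gmail.com> wrote:

> Attaching the Job Graph; the job failed at the Calc operator.
> [image: image.png]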

Re: SQL job fails with only an NPE from the Flink runtime

Posted by Luan Cooper <gc...@gmail.com>.
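Attaching the Job Graph; the job failed at the Calc operator.
[image: image.png]

On Fri, Jul 17, 2020 at 4:01 PM Luan Cooper <gc...@gmail.com> wrote:

> There are actually about 20 fields, and the UDFs used are of the COALESCE / CAST / JSON_PATH / TIMESTAMP kind.
> *Do you mean it is caused by a UDF returning NULL?*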

Re: SQL job fails with only an NPE from the Flink runtime

Posted by Luan Cooper <gc...@gmail.com>.
There are actually about 20 fields, and the UDFs used are of the COALESCE / CAST / JSON_PATH / TIMESTAMP kind.
*Do you mean it is caused by a UDF returning NULL?*


On Fri, Jul 17, 2020 at 2:54 PM godfrey he <go...@gmail.com> wrote:

> What is the logic of udf_xxx?

Re: SQL job fails with only an NPE from the Flink runtime

Posted by godfrey he <go...@gmail.com>.
What is the logic of udf_xxx?

