You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@beam.apache.org by "Brachi Packter (Jira)" <ji...@apache.org> on 2022/03/14 10:39:00 UTC
[jira] [Created] (BEAM-14094) Fix null pointer exception in HllCountInitFn
Brachi Packter created BEAM-14094:
-------------------------------------
Summary: Fix null pointer exception in HllCountInitFn
Key: BEAM-14094
URL: https://issues.apache.org/jira/browse/BEAM-14094
Project: Beam
Issue Type: Bug
Components: dsl-sql, extensions-java-sketching
Reporter: Brachi Packter
When trying to aggregate input with null value we will fail on null pointer exception, when we deal with high events rate we can have sometimes "broken" events and I don't want them the break all the pipline.
trace:
exception: "java.lang.NullPointerException
at com.google.zetasketch.HyperLogLogPlusPlus.add(HyperLogLogPlusPlus.java:212)
at org.apache.beam.sdk.extensions.zetasketch.HllCountInitFn$ForString.addInput(HllCountInitFn.java:147)
at org.apache.beam.sdk.extensions.zetasketch.HllCountInitFn$ForString.addInput(HllCountInitFn.java:137)
at org.apache.beam.sdk.extensions.sql.impl.transform.agg.AggregationCombineFnAdapter$WrappedCombinerBase.addInput(AggregationCombineFnAdapter.java:54)
at org.apache.beam.sdk.transforms.CombineFns$ComposedCombineFn.addInput(CombineFns.java:382)
at org.apache.beam.sdk.schemas.transforms.SchemaAggregateFn$Inner.addInput(SchemaAggregateFn.java:324)
at org.apache.beam.sdk.schemas.transforms.SchemaAggregateFn$Inner.addInput(SchemaAggregateFn.java:63)
at org.apache.beam.runners.dataflow.worker.WindmillStateInternals$WindmillCombiningState.add(WindmillStateInternals.java:2056)
at org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.SystemReduceFn.processValue(SystemReduceFn.java:119)
at org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.processElement(ReduceFnRunner.java:613)
at org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.ReduceFnRunner.processElements(ReduceFnRunner.java:360)
at org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:96)
at org.apache.beam.runners.dataflow.worker.StreamingGroupAlsoByWindowViaWindowSetFn.processElement(StreamingGroupAlsoByWindowViaWindowSetFn.java:43)
at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.invokeProcessElement(GroupAlsoByWindowFnRunner.java:121)
at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowFnRunner.processElement(GroupAlsoByWindowFnRunner.java:73)
at org.apache.beam.runners.dataflow.worker.repackaged.org.apache.beam.runners.core.LateDataDroppingDoFnRunner.processElement(LateDataDroppingDoFnRunner.java:80)
at org.apache.beam.runners.dataflow.worker.GroupAlsoByWindowsParDoFn.processElement(GroupAlsoByWindowsParDoFn.java:137)
at org.apache.beam.runners.dataflow.worker.util.common.worker.ParDoOperation.process(ParDoOperation.java:44)
at org.apache.beam.runners.dataflow.worker.util.common.worker.OutputReceiver.process(OutputReceiver.java:49)
at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.runReadLoop(ReadOperation.java:212)
at org.apache.beam.runners.dataflow.worker.util.common.worker.ReadOperation.start(ReadOperation.java:163)
at org.apache.beam.runners.dataflow.worker.util.common.worker.MapTaskExecutor.execute(MapTaskExecutor.java:92)
at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.process(StreamingDataflowWorker.java:1437)
at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker.access$1100(StreamingDataflowWorker.java:165)
at org.apache.beam.runners.dataflow.worker.StreamingDataflowWorker$7.run(StreamingDataflowWorker.java:1113)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
--
This message was sent by Atlassian Jira
(v8.20.1#820001)