Posted to issues@flink.apache.org by "ambition (JIRA)" <ji...@apache.org> on 2018/09/10 07:16:00 UTC
[jira] [Comment Edited] (FLINK-10299) RowSerializer.copy data value
cast exception and use checkpoint function Lead to Could not restart this
job
[ https://issues.apache.org/jira/browse/FLINK-10299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16608810#comment-16608810 ]
ambition edited comment on FLINK-10299 at 9/10/18 7:15 AM:
-----------------------------------------------------------
Sorry for the delay; the past two days were not workdays. Here is a brief description of the complete process.
The Flink job consumes user app data captured into Kafka, and some values are invalid, such as "-". Sample data:
{code:java}
{"event_id": "10001","uid":"1561529398","timestamp": "1536288421", "viewport_height": "667","viewport_width": "375","language":"zh-CN"}
{"event_id": "1002","uid":"1561529398","timestamp": "-", "viewport_height": "667","viewport_width": "375","language":"zh-CN" }
{"event_id": "1003","uid":"1561529398","timestamp": "1536288421", "viewport_height": "667","viewport_width": "-" ,"language":"zh-CN"}
{code}
Flink Job code:
{code:java}
public class UserDataSQL {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment execEnv = StreamExecutionEnvironment.createLocalEnvironment();
        // Checkpoint every 5 seconds and keep externalized checkpoints on cancellation.
        execEnv.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
        execEnv.getCheckpointConfig().setCheckpointInterval(Long.valueOf(5000));
        execEnv.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
        // At most 3 restart attempts, 10 seconds apart.
        execEnv.setRestartStrategy(RestartStrategies.fixedDelayRestart(3,10000));
        FsStateBackend stateBackend = new FsStateBackend("hdfs:/flink/flink-checkpoints");
        execEnv.setStateBackend(stateBackend);
        StreamTableEnvironment env = StreamTableEnvironment.getTableEnvironment(execEnv);
        // Field name -> Flink type. The values must be TypeInformation instances
        // (org.apache.flink.api.common.typeinfo.Types), not type names as strings.
        Map<String, TypeInformation<?>> schemaMap = new LinkedHashMap<>();
        schemaMap.put("event_id", Types.INT);
        schemaMap.put("uid", Types.LONG);
        schemaMap.put("timestamp", Types.SQL_TIMESTAMP);
        schemaMap.put("viewport_height", Types.INT);
        schemaMap.put("viewport_width", Types.INT);
        schemaMap.put("language", Types.STRING);
        TableSchema tableSchema = new TableSchema(
            schemaMap.keySet().toArray(new String[schemaMap.size()]),
            schemaMap.values().toArray(new TypeInformation<?>[schemaMap.size()])
        );
        Properties kafkaProps = new Properties();
        kafkaProps.setProperty("bootstrap.servers","xxx:9092");
        kafkaProps.setProperty("topic","topic");
        kafkaProps.setProperty("enable.auto.commit","true");
        kafkaProps.setProperty("group.id","flink_group");
        Kafka010JsonTableSource kafka010JsonTableSource = new Kafka010JsonTableSource("topic", kafkaProps, tableSchema, tableSchema);
        kafka010JsonTableSource.setProctimeAttribute("timestamp");
        env.registerTableSource("user_data",kafka010JsonTableSource);
        env.registerTableSink("user_count",new MysqlTableUpsertSink());
        env.sqlUpdate("insert into user_count select count(uid) as uv,event_id from user_data group by event_id");
        execEnv.execute();
    }

    public static class MysqlTableUpsertSink implements UpsertStreamTableSink<Row> {
        //omit other code
    }

    public static class UserData {
        public Integer event_id;
        public Long uid;
        public Timestamp timestamp;
        public Integer viewport_height;
        public Integer viewport_width;
        public String language;
        //omit other code
    }
}
{code}
With checkpointing enabled, if the data contains an invalid value, the job shuts down and cannot be restarted.
Right now there are two ways to get the job running again:
1. Delete the FsStateBackend checkpoint data on HDFS.
2. Set the invalid value to null, as in the picture I attached.
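One plausible reading of why the restarts never succeed, sketched as a toy model and not as Flink code: if each restart resumes from an offset that was checkpointed before the bad record, every attempt reads the same bad record again and fails the same way until the restart attempts run out. The class name, method name, and the array standing in for the Kafka topic are all hypothetical; only the 3 restart attempts come from the job code above.

```java
public class PoisonRecordReplay {

    /**
     * Toy model: on every (re)start the job resumes from the checkpointed offset
     * and fails when it reaches a record that cannot be parsed.
     * Returns the number of failed runs before the job stops.
     */
    public static int runWithRestarts(String[] records, int checkpointedOffset, int maxRestarts) {
        int failures = 0;
        // One initial run plus at most maxRestarts restarts.
        for (int run = 0; run <= maxRestarts; run++) {
            try {
                for (int i = checkpointedOffset; i < records.length; i++) {
                    Long.parseLong(records[i]); // throws NumberFormatException on "-"
                }
                return failures; // reached the end without hitting a bad record
            } catch (NumberFormatException e) {
                failures++; // the restored offset is unchanged, so the next run fails too
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        String[] timestamps = {"1536288421", "-", "1536288421"};
        // Offset 0 was checkpointed before the bad record at index 1 was read.
        System.out.println("failed runs: " + runWithRestarts(timestamps, 0, 3));
    }
}
```

In this model, starting from an offset past the bad record succeeds immediately, which would be consistent with the job starting again once the checkpoint data is deleted.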
Is there a better way to record the invalid data without affecting checkpointing?
Thanks.
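To make the "set the invalid value to null" workaround concrete, here is a minimal sketch in plain Java, independent of Flink. It converts the numeric fields of one raw record, turns any value that cannot be parsed into null, and keeps a list of the offending field names so they can be logged or written elsewhere. TolerantRowParser, Parsed, parse, and parseLongOrNull are hypothetical names for illustration; where such a step would plug into the Kafka source is not shown here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TolerantRowParser {

    /** Result of parsing one record: typed values plus the names of fields that failed. */
    public static final class Parsed {
        public final Map<String, Object> values = new LinkedHashMap<>();
        public final List<String> badFields = new ArrayList<>();
    }

    /** Parses a decimal string to Long, or returns null if it is missing or not a number. */
    public static Long parseLongOrNull(String raw) {
        try {
            return raw == null ? null : Long.valueOf(raw.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    /**
     * Converts the named numeric fields of a raw record. A value that cannot be
     * parsed becomes null and its field name is recorded, instead of throwing.
     */
    public static Parsed parse(Map<String, String> raw, String... numericFields) {
        Parsed out = new Parsed();
        for (String field : numericFields) {
            String text = raw.get(field);
            Long value = parseLongOrNull(text);
            if (value == null && text != null) {
                out.badFields.add(field); // present but invalid, e.g. "-"
            }
            out.values.put(field, value);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> raw = new LinkedHashMap<>();
        raw.put("event_id", "1002");
        raw.put("timestamp", "-");
        raw.put("viewport_width", "375");
        Parsed p = parse(raw, "event_id", "timestamp", "viewport_width");
        System.out.println(p.values + " bad=" + p.badFields);
    }
}
```

The point of returning the bad field names alongside the values is that the record still flows downstream with nulls, while the invalid input is kept for inspection rather than silently dropped.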
> RowSerializer.copy data value cast exception and use checkpoint function Lead to Could not restart this job
> -----------------------------------------------------------------------------------------------------------
>
> Key: FLINK-10299
> URL: https://issues.apache.org/jira/browse/FLINK-10299
> Project: Flink
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.6.0
> Reporter: ambition
> Priority: Minor
> Attachments: image-2018-09-07-17-47-04-343.png
>
>
> Flink SQL is processing collected user behavior data, for example:
> {code:java}
> {
> "event_id": "session_start",
> "timestamp": "-", // error data,
> "viewport_height": "667",
> "viewport_width": "-" //error data
> }
> {code}
> This causes the following exception:
> {code:java}
> 2018-09-07 10:47:01,834 [flink-akka.actor.default-dispatcher-2] INFO executiongraph.ExecutionGraph (ExecutionGraph.java:tryRestartOrFail(1511)) - Could not restart the job Flink Streaming Job (6f0248219c631158f6e38f2dca0beb91) because the restart strategy prevented it.
> java.lang.ClassCastException: java.lang.String cannot be cast to java.sql.Timestamp
> at org.apache.flink.api.common.typeutils.base.SqlTimestampSerializer.copy(SqlTimestampSerializer.java:27)
> at org.apache.flink.api.java.typeutils.runtime.RowSerializer.copy(RowSerializer.java:95)
> at org.apache.flink.api.java.typeutils.runtime.RowSerializer.copy(RowSerializer.java:46)
> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.pushToOperator(OperatorChain.java:577)
> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:554)
> at org.apache.flink.streaming.runtime.tasks.OperatorChain$CopyingChainingOutput.collect(OperatorChain.java:534)
> at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:689)
> at org.apache.flink.streaming.api.operators.AbstractStreamOperator$CountingOutput.collect(AbstractStreamOperator.java:667)
> at org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collect(StreamSourceContexts.java:104)
> at org.apache.flink.streaming.api.operators.StreamSourceContexts$NonTimestampContext.collectWithTimestamp(StreamSourceContexts.java:111)
> at org.apache.flink.streaming.connectors.kafka.internals.AbstractFetcher.emitRecordWithTimestamp(AbstractFetcher.java:398)
> at org.apache.flink.streaming.connectors.kafka.internal.Kafka010Fetcher.emitRecord(Kafka010Fetcher.java:89)
> at org.apache.flink.streaming.connectors.kafka.internal.Kafka09Fetcher.runFetchLoop(Kafka09Fetcher.java:154)
> at org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase.run(FlinkKafkaConsumerBase.java:738)
> at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:87)
> at org.apache.flink.streaming.api.operators.StreamSource.run(StreamSource.java:56)
> at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.run(SourceStreamTask.java:99)
> at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:300)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:711)
> at java.lang.Thread.run(Thread.java:748)
> 2018-09-07 10:47:01,834 [flink-akka.actor.default-dispatcher-2] INFO checkpoint.CheckpointCoordinator (CheckpointCoordinator.java:shutdown(320)) - Stopping checkpoint coordinator for job 6f0248219c631158f6e38f2dca0beb91.
> 2018-09-07 10:47:01,834 [flink-akka.actor.default-dispatcher-2] INFO checkpoint.StandaloneCompletedCheckpointStore (StandaloneCompletedCheckpointStore.java:shutdown(102)) - Shutting down
> {code}
> With Flink checkpointing enabled, this uncaught exception means the job cannot be restarted. As a workaround, I set the offending value to null when the exception occurs, as in the image below. I hope a Flink committer can provide a better solution.
> !image-2018-09-07-17-47-04-343.png!
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)