Posted to commits@seatunnel.apache.org by "gaotong521 (via GitHub)" <gi...@apache.org> on 2024/04/12 03:36:47 UTC
[I] When the hive table storage type is orc, data sinks to the hive, and the task fails to be executed [seatunnel]
gaotong521 opened a new issue, #6694:
URL: https://github.com/apache/seatunnel/issues/6694
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.
### What happened
When data is written to a Hive table stored as ORC and the FieldMapper transform is configured, the job fails if some fields of the Hive table are not mapped.
### SeaTunnel Version
2.3.4
### SeaTunnel Config
```conf
{
"env": {
"parallelism": 3,
"job.mode": "BATCH",
"checkpoint.interval": 30000,
"job.name": "seatunnel_1712823979630"
},
"source": [
{
"plugin_name": "Jdbc",
"result_table_name": "table_source",
"user": "postgres",
"password": "C3kk4v5_b4f2Jr",
"driver": "org.postgresql.Driver",
"url": "jdbc:postgresql://10.188.15.91:5434/gis",
"query": "select event_id,event_type,event_radius,event_source,start_time,end_time,priority,latitude,longitude,elevation,node_ids,create_time,update_time from ghcloud.gh_traffic_event_info"
}
],
"transform": [
{
"plugin_name": "FieldMapper",
"source_table_name": "table_source",
"result_table_name": "table_source_FieldMapper",
"field_mapper": {
"event_id": "event_id",
"event_type": "event_type",
"event_radius": "event_radius",
"event_source": "event_source",
"start_time": "start_time",
"end_time": "end_time",
"priority": "priority",
"latitude": "latitude",
"longitude": "longitude",
"elevation": "elevation",
"node_ids": "node_ids",
"create_time": "create_time",
"update_time": "update_time"
}
}
],
"sink": [
{
"plugin_name": "Hive",
"source_table_name": "table_source_FieldMapper",
"table_name": "gh_cloud_data_model.dwd_pub_traffic_event",
"metastore_uri": "thrift://cloudera-hadoop-61:9083"
}
]
}
```
### Running Command
```shell
Executed by dolphin scheduler
```
### Error Exception
```log
SHUTDOWN
2024-04-12 11:31:30,246 INFO [s.c.s.s.c.ClientExecuteCommand] [main] - Closed SeaTunnel client......
2024-04-12 11:31:30,246 INFO [s.c.s.s.c.ClientExecuteCommand] [main] - Closed metrics executor service ......
2024-04-12 11:31:30,246 ERROR [o.a.s.c.s.SeaTunnel ] [main] -
===============================================================================
2024-04-12 11:31:30,246 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Fatal Error,
2024-04-12 11:31:30,246 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Please submit bug report in https://github.com/apache/seatunnel/issues
2024-04-12 11:31:30,246 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Reason:SeaTunnel job executed failed
2024-04-12 11:31:30,248 ERROR [o.a.s.c.s.SeaTunnel ] [main] - Exception StackTrace:org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:257)
at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:66)
at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.handleRecord(IntermediateBlockingQueue.java:75)
at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.collect(IntermediateBlockingQueue.java:50)
at org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:51)
at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:73)
at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:78)
at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:648)
at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:949)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.buildSchemaWithRowType(OrcWriteStrategy.java:196)
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.getOrCreateWriter(OrcWriteStrategy.java:116)
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.write(OrcWriteStrategy.java:75)
at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:134)
at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:46)
at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:247)
... 16 more
at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:194)
... 2 more
2024-04-12 11:31:30,248 ERROR [o.a.s.c.s.SeaTunnel ] [main] -
===============================================================================
Exception in thread "main" org.apache.seatunnel.core.starter.exception.CommandExecuteException: SeaTunnel job executed failed
at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:202)
at org.apache.seatunnel.core.starter.SeaTunnel.run(SeaTunnel.java:40)
at org.apache.seatunnel.core.starter.seatunnel.SeaTunnelClient.main(SeaTunnelClient.java:34)
Caused by: org.apache.seatunnel.engine.common.exception.SeaTunnelEngineException: java.lang.RuntimeException: java.lang.NullPointerException
at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:257)
at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:66)
at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:39)
at org.apache.seatunnel.engine.server.task.SeaTunnelTransformCollector.collect(SeaTunnelTransformCollector.java:27)
at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.handleRecord(IntermediateBlockingQueue.java:75)
at org.apache.seatunnel.engine.server.task.group.queue.IntermediateBlockingQueue.collect(IntermediateBlockingQueue.java:50)
at org.apache.seatunnel.engine.server.task.flow.IntermediateQueueFlowLifeCycle.collect(IntermediateQueueFlowLifeCycle.java:51)
at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.collect(TransformSeaTunnelTask.java:73)
at org.apache.seatunnel.engine.server.task.SeaTunnelTask.stateProcess(SeaTunnelTask.java:168)
at org.apache.seatunnel.engine.server.task.TransformSeaTunnelTask.call(TransformSeaTunnelTask.java:78)
at org.apache.seatunnel.engine.server.TaskExecutionService$BlockingWorker.run(TaskExecutionService.java:648)
at org.apache.seatunnel.engine.server.TaskExecutionService$NamedTaskWrapper.run(TaskExecutionService.java:949)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.buildSchemaWithRowType(OrcWriteStrategy.java:196)
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.getOrCreateWriter(OrcWriteStrategy.java:116)
at org.apache.seatunnel.connectors.seatunnel.file.sink.writer.OrcWriteStrategy.write(OrcWriteStrategy.java:75)
at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:134)
at org.apache.seatunnel.connectors.seatunnel.file.sink.BaseFileSinkWriter.write(BaseFileSinkWriter.java:46)
at org.apache.seatunnel.engine.server.task.flow.SinkFlowLifeCycle.received(SinkFlowLifeCycle.java:247)
... 16 more
at org.apache.seatunnel.core.starter.seatunnel.command.ClientExecuteCommand.execute(ClientExecuteCommand.java:194)
... 2 more
2024-04-12 11:31:30,249 INFO [s.c.s.s.c.ClientExecuteCommand] [ForkJoinPool.commonPool-worker-2] - run shutdown hook because get close signal
[INFO] 2024-04-12 11:31:30.453 +0800 - FINALIZE_SESSION
```
### Zeta or Flink or Spark Version
_No response_
### Java or Scala Version
_No response_
### Screenshots
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
Re: [I] When the hive table storage type is orc, data sinks to the hive, and the task fails to be executed [seatunnel]
Posted by "LeonYoah (via GitHub)" <gi...@apache.org>.
LeonYoah commented on issue #6694:
URL: https://github.com/apache/seatunnel/issues/6694#issuecomment-2060432346
Please paste the DDL statement for the [gh_cloud_data_model.dwd_pub_traffic_event] table. I suspect that the mapped field names do not match the field names of the destination table, which would cause the NullPointerException.
Re: [I] When the hive table storage type is orc, data sinks to the hive, and the task fails to be executed [seatunnel]
Posted by "LeonYoah (via GitHub)" <gi...@apache.org>.
LeonYoah commented on issue #6694:
URL: https://github.com/apache/seatunnel/issues/6694#issuecomment-2060771448
Pay attention to two things. First, every field in the [hive] table must have a corresponding upstream field. If the upstream has no matching field, you can pass an empty string, that is, [''], as a placeholder, but you cannot use [null]. Second, each mapped field name must be the same as the field name in the table.
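As a rough sketch of the workaround described above, the fragment below adds an empty-string placeholder for a Hive column that has no upstream counterpart. The column name `remark` is hypothetical (the DDL of the Hive table was not posted in this thread), and whether this avoids the NullPointerException in 2.3.4 is an assumption based on the comment, not a verified result. The `#` comments are valid in HOCON-style `.conf` files; remove them if the job is submitted as strict JSON.

```conf
{
  "source": [
    {
      "plugin_name": "Jdbc",
      "result_table_name": "table_source",
      # Connection options (user, password, driver, url) stay as in the original job.
      # ASSUMPTION: `remark` is a hypothetical Hive column with no upstream field.
      # It is selected as an empty string ('') rather than null.
      "query": "select event_id,event_type,event_radius,event_source,start_time,end_time,priority,latitude,longitude,elevation,node_ids,create_time,update_time,'' as remark from ghcloud.gh_traffic_event_info"
    }
  ],
  "transform": [
    {
      "plugin_name": "FieldMapper",
      "source_table_name": "table_source",
      "result_table_name": "table_source_FieldMapper",
      "field_mapper": {
        "event_id": "event_id",
        "event_type": "event_type",
        "event_radius": "event_radius",
        "event_source": "event_source",
        "start_time": "start_time",
        "end_time": "end_time",
        "priority": "priority",
        "latitude": "latitude",
        "longitude": "longitude",
        "elevation": "elevation",
        "node_ids": "node_ids",
        "create_time": "create_time",
        "update_time": "update_time",
        # Hypothetical placeholder column; the target name must match the Hive column name.
        "remark": "remark"
      }
    }
  ]
}
```

The sink block is unchanged. Repeat the placeholder for each Hive column that the upstream query does not supply.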