Posted to commits@seatunnel.apache.org by GitBox <gi...@apache.org> on 2022/08/17 15:02:41 UTC
[GitHub] [incubator-seatunnel] Bingz2 opened a new issue, #2449: [Bug] [Connector-V2][File Local Sink] Error running Spark Connector V2 example using IDE
Bingz2 opened a new issue, #2449:
URL: https://github.com/apache/incubator-seatunnel/issues/2449
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.
### What happened
Error running Spark Connector V2 example using IDE
### SeaTunnel Version
dev
### SeaTunnel Config
```conf
env {
  # You can set spark configuration here
  # see available properties defined by spark: https://spark.apache.org/docs/latest/configuration.html#available-properties
  #job.mode = BATCH
  spark.app.name = "SeaTunnel"
  spark.executor.instances = 2
  spark.executor.cores = 1
  spark.executor.memory = "1g"
  spark.master = local
}

source {
  # This is an example input plugin **only for test and demonstrate the feature input plugin**
  FakeSource {
    result_table_name = "fake"
    field_name = "name,age,timestamp"
  }

  # You can also use other input plugins, such as hdfs
  # hdfs {
  #   result_table_name = "accesslog"
  #   path = "hdfs://hadoop-cluster-01/nginx/accesslog"
  #   format = "json"
  # }

  # If you would like to get more information about how to configure seatunnel and see the full list of input plugins,
  # please go to https://seatunnel.apache.org/docs/spark/configuration/source-plugins/Fake
}

transform {
  # split data by specific delimiter
  # you can also use other transform plugins, such as sql
  sql {
    sql = "select name,age from fake"
    result_table_name = "sql"
  }

  # If you would like to get more information about how to configure seatunnel and see the full list of transform plugins,
  # please go to https://seatunnel.apache.org/docs/spark/configuration/transform-plugins/Split
}

sink {
  # choose stdout output plugin to output data to console
  LocalFile {
    format = "orc"
    path = "D:/workspace/test/st"
    file_name_expression = "orc"
  }

  # you can also use other output plugins, such as sql
  # hdfs {
  #   path = "hdfs://hadoop-cluster-01/nginx/accesslog_processed"
  #   save_mode = "append"
  # }

  # If you would like to get more information about how to configure seatunnel and see the full list of output plugins,
  # please go to https://seatunnel.apache.org/docs/spark/configuration/sink-plugins/Console
}
```
### Running Command
```shell
Run the Spark Connector v2 Example using a local IDE
```
### Error Exception
```log
22/08/17 22:49:03 INFO Executor: Starting executor ID driver on host localhost
22/08/17 22:49:03 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 53914.
22/08/17 22:49:03 INFO NettyBlockTransferService: Server created on GITV:53914
22/08/17 22:49:03 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
22/08/17 22:49:03 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, GITV, 53914, None)
22/08/17 22:49:04 INFO BlockManagerMasterEndpoint: Registering block manager GITV:53914 with 1965.3 MB RAM, BlockManagerId(driver, GITV, 53914, None)
22/08/17 22:49:04 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, GITV, 53914, None)
22/08/17 22:49:04 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, GITV, 53914, None)
22/08/17 22:49:04 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@6f95cd51{/metrics/json,null,AVAILABLE,@Spark}
22/08/17 22:49:04 WARN StreamingContext: spark.master should be set as local[n], n > 1 in local mode if you have receivers to get data, otherwise Spark jobs will not get resources to process the received data.
22/08/17 22:49:04 INFO AbstractPluginDiscovery: Load SeaTunnelSource Plugin from D:\workspace\idea\seatunnel\incubator-seatunnel\seatunnel-common\connectors\seatunnel
22/08/17 22:49:04 INFO AbstractPluginDiscovery: Load plugin: PluginIdentifier{engineType='seatunnel', pluginType='source', pluginName='FakeSource'} from classpath
22/08/17 22:49:04 INFO SparkEnvironment: register plugins :[]
22/08/17 22:49:04 INFO AbstractPluginDiscovery: Load BaseSparkTransform Plugin from D:\workspace\idea\seatunnel\incubator-seatunnel\seatunnel-common\connectors\seatunnel
22/08/17 22:49:04 INFO AbstractPluginDiscovery: Load plugin: PluginIdentifier{engineType='seatunnel', pluginType='transform', pluginName='sql'} from classpath
22/08/17 22:49:04 INFO SparkEnvironment: register plugins :[]
22/08/17 22:49:04 INFO AbstractPluginDiscovery: Load SeaTunnelSink Plugin from D:\workspace\idea\seatunnel\incubator-seatunnel\seatunnel-common\connectors\seatunnel
22/08/17 22:49:04 INFO AbstractPluginDiscovery: Load plugin: PluginIdentifier{engineType='seatunnel', pluginType='sink', pluginName='LocalFile'} from classpath
22/08/17 22:49:04 INFO SparkEnvironment: register plugins :[]
22/08/17 22:49:04 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/D:/workspace/idea/seatunnel/incubator-seatunnel/spark-warehouse/').
22/08/17 22:49:04 INFO SharedState: Warehouse path is 'file:/D:/workspace/idea/seatunnel/incubator-seatunnel/spark-warehouse/'.
22/08/17 22:49:04 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@7c8326a4{/SQL,null,AVAILABLE,@Spark}
22/08/17 22:49:04 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@77128dab{/SQL/json,null,AVAILABLE,@Spark}
22/08/17 22:49:04 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@6f012914{/SQL/execution,null,AVAILABLE,@Spark}
22/08/17 22:49:04 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@18fdb6cf{/SQL/execution/json,null,AVAILABLE,@Spark}
22/08/17 22:49:04 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@720653c2{/static/sql,null,AVAILABLE,@Spark}
22/08/17 22:49:05 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
22/08/17 22:49:07 ERROR SparkApiTaskExecuteCommand: Run SeaTunnel on spark failed.
java.lang.RuntimeException: file_name_expression must contains transactionId when is_enable_transaction is true
at org.apache.seatunnel.connectors.seatunnel.file.sink.config.TextFileSinkConfig.<init>(TextFileSinkConfig.java:112)
at org.apache.seatunnel.connectors.seatunnel.file.sink.AbstractFileSink.getSinkConfig(AbstractFileSink.java:143)
at org.apache.seatunnel.connectors.seatunnel.file.sink.AbstractFileSink.createAggregatedCommitter(AbstractFileSink.java:114)
at org.apache.seatunnel.translation.spark.sink.SparkDataSourceWriter.<init>(SparkDataSourceWriter.java:48)
at org.apache.seatunnel.translation.spark.sink.SparkSink.createWriter(SparkSink.java:67)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:254)
at org.apache.seatunnel.core.starter.spark.execution.SinkExecuteProcessor.execute(SinkExecuteProcessor.java:75)
at org.apache.seatunnel.core.starter.spark.execution.SparkExecution.execute(SparkExecution.java:60)
at org.apache.seatunnel.core.starter.spark.command.SparkApiTaskExecuteCommand.execute(SparkApiTaskExecuteCommand.java:54)
at org.apache.seatunnel.core.starter.Seatunnel.run(Seatunnel.java:40)
at org.apache.seatunnel.example.spark.v2.ExampleUtils.builder(ExampleUtils.java:43)
at org.apache.seatunnel.example.spark.v2.SeaTunnelApiExample.main(SeaTunnelApiExample.java:28)
22/08/17 22:49:07 INFO SparkContext: Invoking stop() from shutdown hook
22/08/17 22:49:07 INFO AbstractConnector: Stopped Spark@9bd0fa6{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
22/08/17 22:49:07 INFO SparkUI: Stopped Spark web UI at http://GITV:4040
22/08/17 22:49:07 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/08/17 22:49:07 INFO MemoryStore: MemoryStore cleared
22/08/17 22:49:07 INFO BlockManager: BlockManager stopped
22/08/17 22:49:07 INFO BlockManagerMaster: BlockManagerMaster stopped
22/08/17 22:49:07 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/08/17 22:49:07 INFO SparkContext: Successfully stopped SparkContext
22/08/17 22:49:07 INFO ShutdownHookManager: Shutdown hook called
```
### Flink or Spark Version
_No response_
### Java or Scala Version
_No response_
### Screenshots
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-seatunnel] EricJoy2048 commented on issue #2449: [Bug] [Connector-V2][File Local Sink] Error running Spark Connector V2 example using IDE
EricJoy2048 commented on issue #2449:
URL: https://github.com/apache/incubator-seatunnel/issues/2449#issuecomment-1219059425
`java.lang.RuntimeException: file_name_expression must contains transactionId when is_enable_transaction is true`
The `file_name_expression` is not a required parameter; its default value is `${transactionId}`. But if you want to set a `file_name_expression` value, the value must contain the `${transactionId}` placeholder.
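As a sketch of what this rule means in practice, a sink block along these lines would pass the check (the path here is illustrative, and the remaining sink options are assumed to keep their defaults):

```conf
sink {
  LocalFile {
    path = "/tmp/seatunnel/output"   # illustrative output directory
    file_format = "orc"
    is_enable_transaction = true
    # valid: the expression contains the required ${transactionId} placeholder
    file_name_expression = "${transactionId}_${now}"
  }
}
```

A bare literal such as `file_name_expression = "orc"` fails because, with transactions enabled, the connector needs the transaction id in the file name to identify and commit (or roll back) the files belonging to a transaction.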
[GitHub] [incubator-seatunnel] TyrantLucifer commented on issue #2449: [Bug] [Connector-V2][File Local Sink] Error running Spark Connector V2 example using IDE
TyrantLucifer commented on issue #2449:
URL: https://github.com/apache/incubator-seatunnel/issues/2449#issuecomment-1218309875
Please refer to these demo config files and change your local file connector config accordingly.
https://github.com/apache/incubator-seatunnel/tree/dev/seatunnel-e2e/seatunnel-spark-connector-v2-e2e/src/test/resources/file
[GitHub] [incubator-seatunnel] Bingz2 commented on issue #2449: [Bug] [Connector-V2][File Local Sink] Error running Spark Connector V2 example using IDE
Bingz2 commented on issue #2449:
URL: https://github.com/apache/incubator-seatunnel/issues/2449#issuecomment-1219117745
Following the example configuration in e2e, I modified my config and it now runs normally. Thank you so much!
```conf
sink {
  # choose stdout output plugin to output data to console
  LocalFile {
    path = "D:/workspace/test/st"
    partition_by = ["age"]
    partition_dir_expression = "${k0}=${v0}"
    is_partition_field_write_in_file = true
    file_name_expression = "${transactionId}_${now}"
    file_format = "orc"
    filename_time_format = "yyyy.MM.dd"
    is_enable_transaction = true
    save_mode = "error"
  }
}
```
[GitHub] [incubator-seatunnel] Bingz2 closed issue #2449: [Bug] [Connector-V2][File Local Sink] Error running Spark Connector V2 example using IDE
Bingz2 closed issue #2449: [Bug] [Connector-V2][File Local Sink] Error running Spark Connector V2 example using IDE
URL: https://github.com/apache/incubator-seatunnel/issues/2449