Posted to commits@seatunnel.apache.org by GitBox <gi...@apache.org> on 2022/10/24 06:21:06 UTC
[GitHub] [incubator-seatunnel] Carl-Zhou-CN opened a new issue, #3168: [Bug][Connector-V1-Spark-Hbase] When the written df is empty, the directory does not exist when the load file is loaded
Carl-Zhou-CN opened a new issue, #3168:
URL: https://github.com/apache/incubator-seatunnel/issues/3168
### Search before asking
- [X] I had searched in the [issues](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.
### What happened
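When the DataFrame being written is empty, the staging directory is not created, so the subsequent HFile bulk load fails with a `FileNotFoundException` because the directory does not exist.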
### SeaTunnel Version
dev
### SeaTunnel Config
```conf
env {
  spark.app.name = "SeaTunnel"
  spark.executor.instances = 2
  spark.executor.cores = 1
  spark.executor.memory = "1g"
  spark.master = local
}
source {
  jdbc {
    driver = com.clickhouse.jdbc.ClickHouseDriver
    url = "jdbc:clickhouse://xxxxxxxxxxxxxxxxxxxxxxxxxxx"
    table = "a"
    result_table_name = "a"
    user = "xxxxx"
    password = "xxxxx"
  }
}
transform {
  sql {
    sql = "select value_string ,arrayJoin(bitmapToArray(uid_bitmap)) uid from ads_22222222.user_tags_bitmap utb where name = 'user_tag_002' and value_string is not null"
  }
}
sink {
  Console {}
  hbase {
    source_table_name = "a"
    hbase.zookeeper.quorum = "xxxxxxxxx"
    catalog = "{\"table\":{\"namespace\":\"default\", \"name\":\"test1\"},\"rowkey\":\"col1\",\"columns\":{\"a\":{\"cf\":\"lab\", \"col\":\"a\", \"type\":\"string\"},\"uid\":{\"cf\":\"rowkey\", \"col\":\"key\", \"type\":\"string\"}}}"
    staging_dir = "/tmp/hbase-staging/"
    save_mode = "append"
  }
}
```
### Running Command
```shell
./bin/start-seatunnel-spark.sh --master local[*] --deploy-mode client --config test2.conf
```
### Error Exception
```log
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.FileNotFoundException: File /tmp/hbase-staging/1666592258580 does not exist.
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:986)
    at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:122)
    at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1046)
    at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1043)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1053)
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.visitBulkHFiles(LoadIncrementalHFiles.java:982)
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.discoverLoadQueue(LoadIncrementalHFiles.java:940)
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.prepareHFileQueue(LoadIncrementalHFiles.java:224)
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:331)
    at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:256)
    at org.apache.seatunnel.spark.hbase.sink.Hbase.output(Hbase.scala:132)
    at org.apache.seatunnel.spark.hbase.sink.Hbase.output(Hbase.scala:41)
    at org.apache.seatunnel.spark.SparkEnvironment.sinkProcess(SparkEnvironment.java:179)
    at org.apache.seatunnel.spark.batch.SparkBatchExecution.start(SparkBatchExecution.java:54)
    at org.apache.seatunnel.core.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:67)
```
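The trace shows the exception surfacing from `LoadIncrementalHFiles.doBulkLoad`, called in `Hbase.output`, while it lists the staging directory. One possible guard is to skip the bulk load when nothing was staged. The sketch below only illustrates that check and is not the connector's actual code or the fix that was applied: it uses `java.nio.file` on the local file system as a stand-in for the Hadoop `FileSystem` the sink works against, and `StagingDirGuard` and `hasFilesToLoad` are hypothetical names.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.stream.Stream;

// Hypothetical helper: decides whether a bulk load should be attempted.
public class StagingDirGuard {

    // Returns true only if the staging directory exists and holds at least one entry.
    static boolean hasFilesToLoad(Path stagingDir) throws IOException {
        if (!Files.isDirectory(stagingDir)) {
            // An empty DataFrame leaves no directory behind, as in the trace above.
            return false;
        }
        try (Stream<Path> entries = Files.list(stagingDir)) {
            return entries.findAny().isPresent();
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed path for illustration; it mirrors the one in the error message.
        Path staging = Paths.get("/tmp/hbase-staging/1666592258580");
        if (hasFilesToLoad(staging)) {
            System.out.println("staged files found, bulk load can proceed");
        } else {
            System.out.println("nothing staged, skipping bulk load");
        }
    }
}
```

In the sink itself the same decision would presumably be made with Hadoop's `FileSystem.exists(Path)` before calling `doBulkLoad`.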
### Flink or Spark Version
_No response_
### Java or Scala Version
_No response_
### Screenshots
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [incubator-seatunnel] TyrantLucifer closed issue #3168: [Bug][Connector-V1-Spark-Hbase] When the written df is empty, the directory does not exist when the load file is loaded
Posted by GitBox <gi...@apache.org>.
TyrantLucifer closed issue #3168: [Bug][Connector-V1-Spark-Hbase] When the written df is empty, the directory does not exist when the load file is loaded
URL: https://github.com/apache/incubator-seatunnel/issues/3168