Posted to commits@seatunnel.apache.org by GitBox <gi...@apache.org> on 2022/10/24 06:21:06 UTC

[GitHub] [incubator-seatunnel] Carl-Zhou-CN opened a new issue, #3168: [Bug][Connector-V1-Spark-Hbase] When the written df is empty, the directory does not exist when the load file is loaded

Carl-Zhou-CN opened a new issue, #3168:
URL: https://github.com/apache/incubator-seatunnel/issues/3168

   ### Search before asking
   
   - [X] I have searched the [issues](https://github.com/apache/incubator-seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.
   
   
   ### What happened
   
   When the DataFrame being written is empty, the staging directory does not exist when the HFiles are bulk loaded, and the job fails with a `FileNotFoundException`.
   
   ### SeaTunnel Version
   
   dev
   
   ### SeaTunnel Config
   
   ```conf
   env {
     spark.app.name = "SeaTunnel"
     spark.executor.instances = 2
     spark.executor.cores = 1
     spark.executor.memory = "1g"
     spark.master = local
   }
   
   source {
     jdbc {
         driver = com.clickhouse.jdbc.ClickHouseDriver
         url = "jdbc:clickhouse://xxxxxxxxxxxxxxxxxxxxxxxxxxx",
         table = "a"
         result_table_name = "a"
         user = "xxxxx"
         password = "xxxxx"
     }
   }
   
   transform {
     sql {
       sql = "select value_string ,arrayJoin(bitmapToArray(uid_bitmap)) uid from  ads_22222222.user_tags_bitmap utb where name = 'user_tag_002' and value_string is not null"
     }
   }
   
   
   sink {
     Console {}
   
     hbase {
       source_table_name = "a"
       hbase.zookeeper.quorum = "xxxxxxxxx"
       catalog = "{\"table\":{\"namespace\":\"default\", \"name\":\"test1\"},\"rowkey\":\"col1\",\"columns\":{\"a\":{\"cf\":\"lab\", \"col\":\"a\", \"type\":\"string\"},\"uid\":{\"cf\":\"rowkey\", \"col\":\"key\", \"type\":\"string\"}}}"
       staging_dir = "/tmp/hbase-staging/"
       save_mode = "append"
     }
   }
   ```
   
   
   ### Running Command
   
   ```shell
   ./bin/start-seatunnel-spark.sh --master local[*] --deploy-mode client --config test2.conf
   ```
   
   
   ### Error Exception
   
   ```log
   at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
   Caused by: java.io.FileNotFoundException: File /tmp/hbase-staging/1666592258580 does not exist.
   	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:986)
   	at org.apache.hadoop.hdfs.DistributedFileSystem.access$1000(DistributedFileSystem.java:122)
   	at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1046)
   	at org.apache.hadoop.hdfs.DistributedFileSystem$24.doCall(DistributedFileSystem.java:1043)
   	at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
   	at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:1053)
   	at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.visitBulkHFiles(LoadIncrementalHFiles.java:982)
   	at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.discoverLoadQueue(LoadIncrementalHFiles.java:940)
   	at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.prepareHFileQueue(LoadIncrementalHFiles.java:224)
   	at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:331)
   	at org.apache.hadoop.hbase.tool.LoadIncrementalHFiles.doBulkLoad(LoadIncrementalHFiles.java:256)
   	at org.apache.seatunnel.spark.hbase.sink.Hbase.output(Hbase.scala:132)
   	at org.apache.seatunnel.spark.hbase.sink.Hbase.output(Hbase.scala:41)
   	at org.apache.seatunnel.spark.SparkEnvironment.sinkProcess(SparkEnvironment.java:179)
   	at org.apache.seatunnel.spark.batch.SparkBatchExecution.start(SparkBatchExecution.java:54)
   	at org.apache.seatunnel.core.spark.command.SparkTaskExecuteCommand.execute(SparkTaskExecuteCommand.java:67)
   ```
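   
   The trace above shows `LoadIncrementalHFiles.doBulkLoad` failing inside `listStatus` because the staging directory does not exist. One possible workaround, sketched here and not taken from the SeaTunnel code base, is to skip the bulk load when the staging directory is missing or empty. The sketch uses `java.nio.file` on the local filesystem as a stand-in for the Hadoop `FileSystem` API; `StagingGuard`, `BulkLoader`, `hasStagedFiles`, and `loadIfPresent` are hypothetical names introduced only for illustration.
   
   ```java
   import java.io.IOException;
   import java.nio.file.DirectoryStream;
   import java.nio.file.Files;
   import java.nio.file.Path;
   import java.nio.file.Paths;
   
   public class StagingGuard {
   
       /** Hypothetical callback standing in for the real bulk-load call. */
       @FunctionalInterface
       public interface BulkLoader {
           void load(Path stagingDir) throws IOException;
       }
   
       /** Returns true only if dir exists, is a directory, and has at least one entry. */
       public static boolean hasStagedFiles(Path dir) throws IOException {
           if (!Files.isDirectory(dir)) {
               return false; // missing path (the case in this report) or not a directory
           }
           try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
               return entries.iterator().hasNext();
           }
       }
   
       /** Runs the loader only when there is something to load; returns whether it ran. */
       public static boolean loadIfPresent(Path dir, BulkLoader loader) throws IOException {
           if (!hasStagedFiles(dir)) {
               return false; // empty input: nothing was staged, so skip the load
           }
           loader.load(dir);
           return true;
       }
   
       public static void main(String[] args) throws IOException {
           // Assumed path for the demo: a staging directory that was never created.
           Path missing = Paths.get(System.getProperty("java.io.tmpdir"),
                   "hbase-staging-" + System.nanoTime());
           boolean ran = loadIfPresent(missing, d -> System.out.println("loading " + d));
           System.out.println("bulk load ran: " + ran);
       }
   }
   ```
   
   In the connector itself, the equivalent check would presumably go through Hadoop's `FileSystem.exists` and `FileSystem.listStatus` on the `staging_dir` path before `doBulkLoad` is called.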
   
   
   ### Flink or Spark Version
   
   _No response_
   
   ### Java or Scala Version
   
   _No response_
   
   ### Screenshots
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [incubator-seatunnel] TyrantLucifer closed issue #3168: [Bug][Connector-V1-Spark-Hbase] When the written df is empty, the directory does not exist when the load file is loaded

Posted by GitBox <gi...@apache.org>.

TyrantLucifer closed issue #3168: [Bug][Connector-V1-Spark-Hbase] When the written df is empty, the directory does not exist when the load file is loaded
URL: https://github.com/apache/incubator-seatunnel/issues/3168

