You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@kyuubi.apache.org by GitBox <gi...@apache.org> on 2022/04/22 02:31:50 UTC
[GitHub] [incubator-kyuubi] bestbugwriter opened a new issue, #2439: [Bug] kyuubi-tpcds will failed with "error=26, Text file busy"
bestbugwriter opened a new issue, #2439:
URL: https://github.com/apache/incubator-kyuubi/issues/2439
### Code of Conduct
- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
### Search before asking
- [X] I have searched in the [issues](https://github.com/apache/incubator-kyuubi/issues?q=is%3Aissue) and found no similar issues.
### Describe the bug
I want to use incubator-kyuubi/dev/kyuubi-tpcds to generate test datas. but I got an exception:
```
Caused by: java.io.IOException: Cannot run program "./dsdgen" (in directory "/tmp/spark-fee2eca2-caae-476a-8b53-2892ecb1045a/tpcds-8e81c7be-86c3-4e7c-a186-f31edee05210"): error=26, Text file busy
```
This problem is inevitable.
I refer to this link for operation https://github.com/apache/incubator-kyuubi/tree/master/dev/kyuubi-tpcds
my command:
```
$SPARK_HOME/bin/spark-submit --driver-memory 16G --class org.apache.kyuubi.tpcds.DataGenerator /home/lpz/incubator-kyuubi/dev/kyuubi-tpcds/target/kyuubi-tpcds_2.12-1.6.0-SNAPSHOT.jar --db tpcds_test1 --scaleFactor 1 --format parquet --parallel 1
```
my hardware:
```
cpu: i5-8500
memory: 24G ddr4 2666
```
my os:
```
Linux U 5.13.0-40-generic #45~20.04.1-Ubuntu SMP Mon Apr 4 09:38:31 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
```
my java version:
```
openjdk version "1.8.0_312"
OpenJDK Runtime Environment (build 1.8.0_312-8u312-b07-0ubuntu1~20.04-b07)
OpenJDK 64-Bit Server VM (build 25.312-b07, mixed mode
```
my spark version:
```
spark-3.2.1-bin-hadoop3.2
```
### Affects Version(s)
master
### Kyuubi Server Log Output
```logtalk
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2454)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2403)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2402)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2402)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1160)
at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1160)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1160)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2642)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2584)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2573)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
Caused by: java.io.IOException: Cannot run program "./dsdgen" (in directory "/tmp/spark-fee2eca2-caae-476a-8b53-2892ecb1045a/tpcds-8e81c7be-86c3-4e7c-a186-f31edee05210"): error=26, Text file busy
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
at org.apache.kyuubi.tpcds.TableGenerator.$anonfun$toDF$1(TableGenerator.scala:100)
at org.apache.kyuubi.tpcds.TableGenerator.$anonfun$toDF$1$adapted(TableGenerator.scala:57)
at scala.collection.Iterator$$anon$11.nextCur(Iterator.scala:486)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:492)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759)
at org.apache.spark.sql.execution.columnar.DefaultCachedBatchSerializer$$anon$1.hasNext(InMemoryRelation.scala:118)
at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
at org.apache.spark.storage.memory.MemoryStore.putIterator(MemoryStore.scala:223)
at org.apache.spark.storage.memory.MemoryStore.putIteratorAsValues(MemoryStore.scala:302)
at org.apache.spark.storage.BlockManager.$anonfun$doPutIterator$1(BlockManager.scala:1481)
at org.apache.spark.storage.BlockManager.org$apache$spark$storage$BlockManager$$doPut(BlockManager.scala:1408)
at org.apache.spark.storage.BlockManager.doPutIterator(BlockManager.scala:1472)
at org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:1295)
at org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:384)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:335)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:52)
at org.apache.spark.scheduler.Task.run(Task.scala:131)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: error=26, Text file busy
at java.lang.UNIXProcess.forkAndExec(Native Method)
at java.lang.UNIXProcess.<init>(UNIXProcess.java:247)
at java.lang.ProcessImpl.start(ProcessImpl.java:134)
at java.lang.ProcessBuilder.start(ProcessBuilder.java:1029)
... 44 more
22/04/22 09:57:27 INFO SparkContext: Invoking stop() from shutdown hook
22/04/22 09:57:27 INFO SparkUI: Stopped Spark web UI at http://npc:4040
22/04/22 09:57:28 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
22/04/22 09:57:28 INFO MemoryStore: MemoryStore cleared
22/04/22 09:57:28 INFO BlockManager: BlockManager stopped
22/04/22 09:57:28 INFO BlockManagerMaster: BlockManagerMaster stopped
22/04/22 09:57:28 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
22/04/22 09:57:28 INFO SparkContext: Successfully stopped SparkContext
22/04/22 09:57:28 INFO ShutdownHookManager: Shutdown hook called
22/04/22 09:57:28 INFO ShutdownHookManager: Deleting directory
```
### Kyuubi Engine Log Output
_No response_
### Kyuubi Server Configurations
_No response_
### Kyuubi Engine Configurations
_No response_
### Additional context
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org
[GitHub] [incubator-kyuubi] pan3793 commented on issue #2439: [Bug] kyuubi-tpcds will failed with "error=26, Text file busy"
Posted by GitBox <gi...@apache.org>.
pan3793 commented on issue #2439:
URL: https://github.com/apache/incubator-kyuubi/issues/2439#issuecomment-1107398660
Maybe we can migrate the current tpcds tool to https://github.com/trinodb/tpcds, it's pure Java and under Apache License
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org
[GitHub] [incubator-kyuubi] yaooqinn closed issue #2439: [Bug] kyuubi-tpcds will failed with "error=26, Text file busy"
Posted by GitBox <gi...@apache.org>.
yaooqinn closed issue #2439: [Bug] kyuubi-tpcds will failed with "error=26, Text file busy"
URL: https://github.com/apache/incubator-kyuubi/issues/2439
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org
[GitHub] [incubator-kyuubi] pan3793 commented on issue #2439: [Bug] kyuubi-tpcds will failed with "error=26, Text file busy"
Posted by GitBox <gi...@apache.org>.
pan3793 commented on issue #2439:
URL: https://github.com/apache/incubator-kyuubi/issues/2439#issuecomment-1189796225
Some deep analysis in the Hadoop community https://issues.apache.org/jira/browse/MAPREDUCE-2374
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org
[GitHub] [incubator-kyuubi] github-actions[bot] commented on issue #2439: [Bug] kyuubi-tpcds will failed with "error=26, Text file busy"
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on issue #2439:
URL: https://github.com/apache/incubator-kyuubi/issues/2439#issuecomment-1105942581
Hello @bestbugwriter,
Thanks for finding the time to report the issue!
We really appreciate the community's efforts to improve Apache Kyuubi (Incubating).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org