You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2017/06/19 12:37:00 UTC
[jira] [Commented] (SPARK-21139)
java.util.concurrent.RejectedExecutionException: rejected from
java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0,
active threads = 0, queued tasks = 0, completed tasks = 14109]
[ https://issues.apache.org/jira/browse/SPARK-21139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16053911#comment-16053911 ]
Sean Owen commented on SPARK-21139:
-----------------------------------
That looks like an issue from the HBase client, not Spark.
> java.util.concurrent.RejectedExecutionException: rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-21139
> URL: https://issues.apache.org/jira/browse/SPARK-21139
> Project: Spark
> Issue Type: Bug
> Components: Input/Output
> Affects Versions: 1.5.2
> Environment: use spark1.5.2 and hbase 1.1.2
> Reporter: shining
>
> We create two tables use Hive HBaseStorageHandler like:
> CREATE EXTERNAL TABLE `yx_bw`(
> `rowkey` string,
> `occur_time` string,
> `milli_second` string,
> `yx_id` string ,
> `resp_area` string ,
> `st_id` string,
> `bay_id` string,
> `device_type_id` string,
> `content` string,
> ......)
> ROW FORMAT SERDE
> 'org.apache.hadoop.hive.hbase.HBaseSerDe'
> STORED BY
> 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'
> WITH SERDEPROPERTIES (
> 'hbase.columns.mapping'=':key,f:OCCUR_TIME,f:MILLI_SECOND,f:YX_ID,f:RESP_AREA,f:ST_ID,f:BAY_ID,f:STATUS,f:CONTENT,f:VLTY_ID,f:MEAS_TYPE,f:RESTRAIN_FLAG,f:R
> ESERV_INT1,f:RESERV_INT2,f:CUSTOMIZED_GROUP,f:CONFIRM_STATUS,f:CONFIRM_TIME,f:CONFIRM_USER_ID,f:CONFIRM_NODE_ID,f:IF_DISPLAY', 'serialization.format'='1')
> TBLPROPERTIES (
> 'hbase.table.name'='yx_bw'')
> Then we use sparksql to run a join between two tables.
> select * from xxgljxb a, yx_bw b where a.YX_ID = b.YX_ID;
> When scan hbase table, we encounter the issue:
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 1.0 failed 1 times, most recent failure: Lost task 2.0 in stage 1.0 (TID 3, localhost): java.lang.RuntimeException: java.util.concurrent. : Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFutureQueueingFuture@37b2d978 rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]
> at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:208)
> at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
> at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403)
> at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
> at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:205)
> at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:147)
> at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$1.nextKeyValue(TableInputFormatBase.java:216)
> at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat$1.next(HiveHBaseTableInputFormat.java:156)
> at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat$1.next(HiveHBaseTableInputFormat.java:114)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:248)
> at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:216)
> at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
> at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:118)
> at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> at org.apache.spark.scheduler.Task.run(Task.scala:88)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@37b2d978 rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]
> at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
> at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
> at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
> at org.apache.hadoop.hbase.client.ResultBoundedCompletionService.submit(ResultBoundedCompletionService.java:142)
> at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.addCallsForCurrentReplica(ScannerCallableWithReplicas.java:269)
> at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:165)
> at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
> at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
> ... 27 more
> Driver stacktrace:
> at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
> at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
> at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
> at scala.Option.foreach(Option.scala:236)
> at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
> at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
> at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1850)
> at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
> at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:909)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
> at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
> at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
> at org.apache.spark.rdd.RDD.collect(RDD.scala:908)
> at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:405)
> at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
> at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> at py4j.Gateway.invoke(Gateway.java:259)
> at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> at py4j.commands.CallCommand.execute(CallCommand.java:79)
> at py4j.GatewayConnection.run(GatewayConnection.java:207)
> at java.lang.Thread.run(Thread.java:745)
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org