Posted to issues@spark.apache.org by "shining (JIRA)" <ji...@apache.org> on 2017/06/24 01:51:01 UTC

[jira] [Closed] (SPARK-21139) java.util.concurrent.RejectedExecutionException: rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]

     [ https://issues.apache.org/jira/browse/SPARK-21139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

shining closed SPARK-21139.
---------------------------

Maybe HiveHBaseTableInputFormat uses the old HBase client API; this is not a Spark issue.
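
For anyone who lands here later: one way to check whether the old-API path in HiveHBaseTableInputFormat is the culprit is to scan the same HBase table through the newer org.apache.hadoop.hbase.mapreduce.TableInputFormat via SparkContext.newAPIHadoopRDD. A minimal sketch, assuming a standalone Scala driver with the HBase client jars on the classpath (only the table name yx_bw comes from this report; everything else is illustrative):

    import org.apache.hadoop.hbase.HBaseConfiguration
    import org.apache.hadoop.hbase.client.Result
    import org.apache.hadoop.hbase.io.ImmutableBytesWritable
    import org.apache.hadoop.hbase.mapreduce.TableInputFormat
    import org.apache.spark.{SparkConf, SparkContext}

    object NewApiScanSketch {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("hbase-newapi-scan"))

        // Point the new-API (mapreduce) TableInputFormat at the table from the report.
        val hbaseConf = HBaseConfiguration.create()
        hbaseConf.set(TableInputFormat.INPUT_TABLE, "yx_bw")

        // Each record is (row key, Result); a count is enough to exercise the scan path.
        val rows = sc.newAPIHadoopRDD(
          hbaseConf,
          classOf[TableInputFormat],
          classOf[ImmutableBytesWritable],
          classOf[Result])

        println("rows scanned: " + rows.count())
        sc.stop()
      }
    }

If this scan succeeds while the same query through the Hive storage handler fails, that points at the handler's old-API record reader rather than at Spark itself.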

> java.util.concurrent.RejectedExecutionException: rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21139
>                 URL: https://issues.apache.org/jira/browse/SPARK-21139
>             Project: Spark
>          Issue Type: Bug
>          Components: Input/Output
>    Affects Versions: 1.5.2
>         Environment: use spark1.5.2 and hbase 1.1.2
>            Reporter: shining
>
> We create two tables using the Hive HBaseStorageHandler, like:
> CREATE EXTERNAL TABLE `yx_bw`(
>   `rowkey` string, 
>   `occur_time` string, 
>   `milli_second` string, 
>   `yx_id` string , 
>   `resp_area` string , 
>   `st_id` string, 
>   `bay_id` string, 
>   `device_type_id` string, 
>   `content` string, 
>     ......)
> ROW FORMAT SERDE 
>   'org.apache.hadoop.hive.hbase.HBaseSerDe' 
> STORED BY 
>   'org.apache.hadoop.hive.hbase.HBaseStorageHandler' 
> WITH SERDEPROPERTIES ( 
>  'hbase.columns.mapping'=':key,f:OCCUR_TIME,f:MILLI_SECOND,f:YX_ID,f:RESP_AREA,f:ST_ID,f:BAY_ID,f:STATUS,f:CONTENT,f:VLTY_ID,f:MEAS_TYPE,f:RESTRAIN_FLAG,f:RESERV_INT1,f:RESERV_INT2,f:CUSTOMIZED_GROUP,f:CONFIRM_STATUS,f:CONFIRM_TIME,f:CONFIRM_USER_ID,f:CONFIRM_NODE_ID,f:IF_DISPLAY',
>   'serialization.format'='1')
> TBLPROPERTIES (
>   'hbase.table.name'='yx_bw')
> Then we use Spark SQL to run a join between the two tables:
> select * from xxgljxb a, yx_bw b where a.YX_ID = b.YX_ID; 
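>
> A minimal sketch of issuing this join from a Spark 1.5.x driver, shown here in Scala with a HiveContext so the HBaseStorageHandler tables registered in the Hive metastore are visible to Spark SQL (our real job is submitted through PySpark, as the py4j frames in the trace below show; only the two table names come from the query above):
>
>   import org.apache.spark.{SparkConf, SparkContext}
>   import org.apache.spark.sql.hive.HiveContext
>
>   object JoinSketch {
>     def main(args: Array[String]): Unit = {
>       val sc = new SparkContext(new SparkConf().setAppName("hbase-join-sketch"))
>       // HiveContext reads the Hive metastore, so xxgljxb and yx_bw resolve
>       // to the HBaseStorageHandler tables defined above.
>       val hiveContext = new HiveContext(sc)
>       val joined = hiveContext.sql(
>         "select * from xxgljxb a, yx_bw b where a.YX_ID = b.YX_ID")
>       println(joined.count())
>       sc.stop()
>     }
>   }
>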
> When scanning the HBase table, we encounter this issue:
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 1.0 failed 1 times, most recent failure: Lost task 2.0 in stage 1.0 (TID 3, localhost): java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@37b2d978 rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]
> 	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:208)
> 	at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
> 	at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403)
> 	at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
> 	at org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl.nextKeyValue(TableRecordReaderImpl.java:205)
> 	at org.apache.hadoop.hbase.mapreduce.TableRecordReader.nextKeyValue(TableRecordReader.java:147)
> 	at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase$1.nextKeyValue(TableInputFormatBase.java:216)
> 	at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat$1.next(HiveHBaseTableInputFormat.java:156)
> 	at org.apache.hadoop.hive.hbase.HiveHBaseTableInputFormat$1.next(HiveHBaseTableInputFormat.java:114)
> 	at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:248)
> 	at org.apache.spark.rdd.HadoopRDD$$anon$1.getNext(HadoopRDD.scala:216)
> 	at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
> 	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> 	at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:388)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> 	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:327)
> 	at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.insertAll(BypassMergeSortShuffleWriter.java:118)
> 	at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:73)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:73)
> 	at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
> 	at org.apache.spark.scheduler.Task.run(Task.scala:88)
> 	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> 	at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@37b2d978 rejected from java.util.concurrent.ThreadPoolExecutor@46477dd0[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 14109]
> 	at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2048)
> 	at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:821)
> 	at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1372)
> 	at org.apache.hadoop.hbase.client.ResultBoundedCompletionService.submit(ResultBoundedCompletionService.java:142)
> 	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.addCallsForCurrentReplica(ScannerCallableWithReplicas.java:269)
> 	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:165)
> 	at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
> 	at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
> 	... 27 more
> Driver stacktrace:
> 	at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1283)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1271)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1270)
> 	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> 	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
> 	at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1270)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
> 	at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
> 	at scala.Option.foreach(Option.scala:236)
> 	at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1496)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
> 	at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1447)
> 	at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> 	at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
> 	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
> 	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
> 	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1850)
> 	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
> 	at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:909)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
> 	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
> 	at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
> 	at org.apache.spark.rdd.RDD.collect(RDD.scala:908)
> 	at org.apache.spark.api.python.PythonRDD$.collectAndServe(PythonRDD.scala:405)
> 	at org.apache.spark.api.python.PythonRDD.collectAndServe(PythonRDD.scala)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:606)
> 	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
> 	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
> 	at py4j.Gateway.invoke(Gateway.java:259)
> 	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
> 	at py4j.commands.CallCommand.execute(CallCommand.java:79)
> 	at py4j.GatewayConnection.run(GatewayConnection.java:207)
> 	at java.lang.Thread.run(Thread.java:745)
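
A note on what the exception itself means: once a java.util.concurrent.ThreadPoolExecutor has been shut down, its AbortPolicy rejects every task submitted afterwards, and the "[Terminated, pool size = 0, ...]" in the message above is exactly that state. Here the pool belongs to the HBase client, so one plausible reading is that the scanner's connection (and its executor) was already closed while ClientScanner was still scheduling scan RPCs. A minimal, HBase-independent illustration of the mechanism (a sketch, not code from the HBase client):

    import java.util.concurrent.{Executors, RejectedExecutionException}

    object RejectionSketch {
      def main(args: Array[String]): Unit = {
        val pool = Executors.newFixedThreadPool(1)
        // After shutdown() the pool accepts no new work and, with nothing queued,
        // quickly reaches the Terminated state reported in the stack trace.
        pool.shutdown()
        try {
          pool.execute(new Runnable { def run(): Unit = () })
        } catch {
          case e: RejectedExecutionException =>
            println("rejected as expected: " + e.getMessage)
        }
      }
    }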


