Posted to user@kudu.apache.org by Darren Hoo <da...@gmail.com> on 2016/03/14 16:20:10 UTC

sparkContext won't stop when using spark-kudu

I use sqlContext to register the Kudu table:

sqlContext.read
  .format("org.kududb.spark")
  .options(Map("kudu.table" -> table, "kudu.master" -> kuduMaster))
  .load()
  .registerTempTable(table)

then run some queries and processing:

     sqlContext.sql("...")


but after sc.stop() is called, the Spark driver never exits:

16/03/14 22:54:51 INFO DAGScheduler: Stopping DAGScheduler
16/03/14 22:54:51 INFO YarnClientSchedulerBackend: Shutting down all executors
16/03/14 22:54:51 INFO YarnClientSchedulerBackend: Interrupting monitor thread
16/03/14 22:54:51 INFO YarnClientSchedulerBackend: Asking each executor to shut down
16/03/14 22:54:51 INFO YarnClientSchedulerBackend: Stopped
16/03/14 22:54:51 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
16/03/14 22:54:51 INFO MemoryStore: MemoryStore cleared
16/03/14 22:54:51 INFO BlockManager: BlockManager stopped
16/03/14 22:54:51 INFO BlockManagerMaster: BlockManagerMaster stopped
16/03/14 22:54:51 INFO SparkContext: Successfully stopped SparkContext
16/03/14 22:54:51 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
16/03/14 22:54:51 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
16/03/14 22:54:51 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
16/03/14 22:54:51 INFO Remoting: Remoting shut down
16/03/14 22:54:51 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.

then it hangs there forever.

PS: I run Spark on YARN in client mode; the problem only occurs when I
use kudu-spark.

I have the thread dump (about 7 KB gzipped); I can post it here if asked.
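
For reference, the whole driver boils down to something like this (a minimal
sketch; the app name, table, and master addresses are placeholders):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Minimal driver sketch reproducing the flow above; names are placeholders.
object KuduHangRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("kudu-hang-repro"))
    val sqlContext = new SQLContext(sc)

    sqlContext.read
      .format("org.kududb.spark")
      .options(Map("kudu.table" -> "t1",
                   "kudu.master" -> "master1:7051,master2:7051"))
      .load()
      .registerTempTable("t1")

    sqlContext.sql("select id from t1").count()

    sc.stop()  // logs "Successfully stopped SparkContext", but the JVM never exits
  }
}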

Re: sparkContext won't stop when using spark-kudu

Posted by Dan Burkert <da...@cloudera.com>.
Hi Darren,

I found the culprit, and I've put up a patch here
<http://gerrit.cloudera.org:8080/#/c/2571/>.  It should make it into the next
release (0.8.0).  Until then, stopping the shell with the 'exit' command or
Ctrl-C should do the trick.
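
For the curious, the gist of the fix is to make sure the connector's Kudu
client gets closed when the JVM shuts down, so its non-daemon client threads
can't keep the process alive. A rough sketch of the idea (not the patch
verbatim), using the synchronous client from the 0.7 Java API:

import org.kududb.client.KuduClient

// Sketch only: close the client on JVM shutdown so its non-daemon
// threads don't block process exit.
val client = new KuduClient.KuduClientBuilder("master1:7051").build()
Runtime.getRuntime.addShutdownHook(new Thread(new Runnable {
  override def run(): Unit = client.close()
}))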

- Dan


Re: sparkContext won't stop when using spark-kudu

Posted by Dan Burkert <da...@cloudera.com>.
Hi Darren,

I was able to repro locally.  I think our connector is not implementing
some shutdown hooks.  I'm going to track down a Spark expert and figure out
exactly what we should be doing to have a more graceful shutdown.
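
The mechanism, for anyone following along: the JVM only exits once every
non-daemon thread has finished, so an unclosed client with live non-daemon
threads hangs the process even after main() returns. A toy illustration:

// The JVM exits only when all non-daemon threads are done; this hangs
// after main() returns, just like the stuck driver.
object NonDaemonDemo {
  def main(args: Array[String]): Unit = {
    val t = new Thread(new Runnable {
      override def run(): Unit = Thread.sleep(Long.MaxValue)
    })
    // t.setDaemon(true)  // uncomment and the process exits immediately
    t.start()
    println("main() is done, but the process keeps running")
  }
}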

- Dan

Re: sparkContext won't stop when using spark-kudu

Posted by Dan Burkert <da...@cloudera.com>.
Hi Darren,

That does look like it's the Kudu client that is preventing shutdown.  I'm
going to try to reproduce it today; I'll let you know what I find.  Thanks
for the report!
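
If you want to confirm on your side without digging through the whole dump,
pasting something like this into the stuck shell (before exiting) lists the
live non-daemon threads that can block exit:

import scala.collection.JavaConverters._

// List live non-daemon threads; any Kudu or netty threads here are suspects.
Thread.getAllStackTraces.keySet.asScala
  .filter(t => !t.isDaemon && t.isAlive)
  .foreach(t => println(s"${t.getName}: ${t.getState}"))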

- Dan


Re: sparkContext won't stop when using spark-kudu

Posted by Darren Hoo <da...@gmail.com>.
Hi Dan,

Here's my environment:

CDH Version:  5.5.0-1.cdh5.5.0.p0.8
Kudu Version:  0.7.1-1.kudu0.7.1.p0.36

Steps to reproduce:


*1. Create the Kudu table:*

CREATE TABLE t1 (
  id bigint
)
TBLPROPERTIES(
  'storage_handler' = 'com.cloudera.kudu.hive.KuduStorageHandler',
  'kudu.table_name' = 't1',
  'kudu.master_addresses' = 'master1:7051,master2:7051',
  'kudu.key_columns' = 'id',
  'kudu.num_tablet_replicas' = '5'
)

*2. Insert some values:*

insert into t1 values (1),(2),(3),(4),(5);

*3. Start the spark-shell:*

$ spark-shell --jars lib/interface-annotations-0.7.1.jar,lib/kudu-client-0.7.1.jar,lib/kudu-mapreduce-0.7.1.jar,lib/kudu-spark-0.7.1.jar


>  sqlContext.read
     .format("org.kududb.spark")
     .options(Map("kudu.table" -> "t1",
                  "kudu.master" -> "master1:7051,master2:7051"))
     .load()
     .registerTempTable("t1")

>  sqlContext.sql("select id from t1").count


*4. Exit the spark-shell with Ctrl-D:*

> Ctrl-D

When the spark-shell is shutting down, the last thing it shows is:

16/03/15 11:48:17 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.

Then the process hangs forever until Ctrl-C is pressed.

I don't have to do cleanup for sqlContext manually, right?
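
(In a standalone driver, I suppose I could force the exit after sc.stop() as
a stopgap, though that just papers over whatever thread is left behind:)

sc.stop()
// Stopgap, not a fix: force the JVM down even if non-daemon threads remain.
sys.exit(0)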

The stack dump is attached.



Re: sparkContext won't stop when using spark-kudu

Posted by Dan Burkert <da...@cloudera.com>.
Hi Darren,

I think the thread dump would be helpful.  We have a very similar test in
the repository, and we haven't had any problems with that.  What
environment are you running the job in?

- Dan
