Posted to dev@spark.apache.org by canan chen <cc...@gmail.com> on 2015/10/14 04:50:47 UTC
When does python program started in pyspark
I looked at the Spark source code, but couldn't find where the Python
program is started.
It seems spark-submit calls PythonGatewayServer, but where is the Python
program started?
Thanks
Re: When does python program started in pyspark
Posted by canan chen <cc...@gmail.com>.
I think PythonRunner is launched when a Python script is executed, while
PythonGatewayServer is the entry point for the Python Spark shell:
    if (args.isPython && deployMode == CLIENT) {
      if (args.primaryResource == PYSPARK_SHELL) {
        args.mainClass = "org.apache.spark.api.python.PythonGatewayServer"
      } else {
        // If a python file is provided, add it to the child arguments and list of files to deploy.
        // Usage: PythonAppRunner <main python file> <extra python files> [app arguments]
        args.mainClass = "org.apache.spark.deploy.PythonRunner"
        args.childArgs = ArrayBuffer(args.primaryResource, args.pyFiles) ++ args.childArgs
        if (clusterManager != YARN) {
          // The YARN backend distributes the primary file differently, so don't merge it.
          args.files = mergeFileLists(args.files, args.primaryResource)
        }
      }
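To make the branch above easier to follow, here is a small Python model of the same decision. It is only a sketch for illustration, not Spark code: the function names are made up, and the value of PYSPARK_SHELL ("pyspark-shell") is my assumption about what the constant holds.

```python
# Simplified model of the quoted SparkSubmit branch; not Spark's actual API.
# Assumption: PYSPARK_SHELL is the sentinel resource name used for the shell.
PYSPARK_SHELL = "pyspark-shell"


def choose_main_class(primary_resource, is_python=True, deploy_mode="client"):
    """Return the JVM main class the quoted branch would pick, or None."""
    if not (is_python and deploy_mode == "client"):
        return None  # the quoted branch does not apply
    if primary_resource == PYSPARK_SHELL:
        # Interactive shell: the JVM side only serves as a gateway.
        return "org.apache.spark.api.python.PythonGatewayServer"
    # A Python file was given: PythonRunner is the main class.
    return "org.apache.spark.deploy.PythonRunner"


def build_child_args(primary_resource, py_files, app_args):
    """Mirror ArrayBuffer(primaryResource, pyFiles) ++ childArgs."""
    return [primary_resource, py_files] + list(app_args)
```

So for something like "spark-submit app.py", choose_main_class("app.py") selects PythonRunner, and the script path becomes the first child argument.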
On Wed, Oct 14, 2015 at 12:46 PM, skaarthik oss <sk...@gmail.com> wrote:
> See PythonRunner @
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala
>
> On Tue, Oct 13, 2015 at 7:50 PM, canan chen <cc...@gmail.com> wrote:
>
>> I look at the source code of spark, but didn't find where python program
>> is started in python.
>>
>> It seems spark-submit will call PythonGatewayServer, but where is python
>> program started ?
>>
>> Thanks
>>
>
>
Re: When does python program started in pyspark
Posted by skaarthik oss <sk...@gmail.com>.
See PythonRunner @
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/PythonRunner.scala
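As far as I understand it, PythonRunner starts a Py4J gateway in the JVM and then launches the Python program as a child process, passing the gateway port through the PYSPARK_GATEWAY_PORT environment variable. A rough Python sketch of that launch step follows. The real code is Scala and also sets other things (for example the Python path), and launch_python_program is a hypothetical name used only here.

```python
import os
import subprocess
import sys


def launch_python_program(script_path, app_args, gateway_port):
    """Run a Python script as a child process, handing it the gateway port.

    Illustrative only: models the idea of the launch step, not Spark's code.
    """
    env = dict(os.environ)
    # The child reads this variable to find the gateway it should connect to.
    env["PYSPARK_GATEWAY_PORT"] = str(gateway_port)
    cmd = [sys.executable, script_path] + list(app_args)
    # Wait for the child to finish and collect its output.
    return subprocess.run(cmd, env=env, capture_output=True, text=True)
```

The child script can then read os.environ["PYSPARK_GATEWAY_PORT"] to find out where to connect.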
On Tue, Oct 13, 2015 at 7:50 PM, canan chen <cc...@gmail.com> wrote:
> I look at the source code of spark, but didn't find where python program
> is started in python.
>
> It seems spark-submit will call PythonGatewayServer, but where is python
> program started ?
>
> Thanks
>