Posted to user@livy.apache.org by Junaid Nasir <jn...@an10.io> on 2018/01/11 10:22:37 UTC

Session taking all the available resources even with number of cores specified

Hi everyone,

I am using Livy 0.4 with a Spark 2.1.0 standalone cluster. I can create sessions
and run jobs, but one session takes up all the available resources. I have tried
setting executorCores and numExecutors, as well as spark.total.executor.cores.
This command works fine when running a session from the command line:

    ./spark-2.1.0/bin/pyspark --master spark://master:7077 --executor-cores 2 --num-executors 1 --total-executor-cores 4

POST request to livy:8998/sessions:

    {
        "kind": "pyspark",
        "proxyUser": "root",
        "conf": {
            "spark.cassandra.connection.host": "10.128.1.1,10.128.1.2,10.128.1.3",
            "spark.executor.cores": 2,
            "spark.total.executor.cores": 2,
            "livy.spark.driver.cores": 2,
            "livy.spark.executor.cores": 2,
            "livy.spark.executor.instances": 1
        },
        "executorMemory": "1G",
        "executorCores": 2,
        "numExecutors": 1,
        "driverCores": 1,
        "driverMemory": "1G"
    }
                

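For reference, a minimal sketch of how such a request can be sent with curl (the host name "livy" and port 8998 are taken from above; /sessions is the standard Livy REST endpoint for creating a session):

    curl -s -X POST http://livy:8998/sessions \
      -H 'Content-Type: application/json' \
      -d '{
            "kind": "pyspark",
            "executorMemory": "1G",
            "executorCores": 2,
            "numExecutors": 1,
            "driverCores": 1,
            "driverMemory": "1G"
          }'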

Is there any configuration I can do to limit the cores, so that I can run
multiple sessions on the same cluster?

Regards,
Junaid

Re: Session taking all the available resources even with number of cores specified

Posted by Junaid Nasir <jn...@an10.io>.
On the create-an-issue popup I am unable to select Livy from the project dropdown;
there are other projects which I can select, but not Livy.





On Wed, Jan 31, 2018 3:04 AM, Alex Bozarth ajbozart@us.ibm.com wrote:
I created the JIRA for you (https://issues.apache.org/jira/browse/LIVY-437), but
what exactly stopped you from creating a JIRA? We don't want this to be a
problem in the future for other users trying to report issues.



Re: Session taking all the available resources even with number of cores specified

Posted by Alex Bozarth <aj...@us.ibm.com>.
I created the JIRA for you (https://issues.apache.org/jira/browse/LIVY-437),
but what exactly stopped you from creating a JIRA? We don't want this to be
a problem in the future for other users trying to report issues.

                                                                                   
Alex Bozarth
Software Engineer
Spark Technology Center

E-mail: ajbozart@us.ibm.com
GitHub: github.com/ajbozarth

505 Howard Street
San Francisco, CA 94105
United States


Re: Session taking all the available resources even with number of cores specified

Posted by Junaid Nasir <jn...@an10.io>.
Hello everyone,

I don't have access to create a JIRA issue on the Livy project. Can someone please
create a ticket around this?

Issue: Livy doesn't support the "total-executor-cores" setting for a Spark standalone
cluster. This would enable Livy to run multiple Spark executors on a standalone
cluster.

Thanks,
Junaid






Re: Session taking all the available resources even with number of cores specified

Posted by Junaid Nasir <jn...@an10.io>.
Hi Alex,

Thank you for taking the time to investigate this. I will create a JIRA ticket
and try to help with a PR.






Re: Session taking all the available resources even with number of cores specified

Posted by Alex Bozarth <aj...@us.ibm.com>.
Hey Junaid,

I did some digging in both the Livy and Spark code and you are correct about the
issue being with total-executor-cores. There is no such conf as
"spark.total.executor.cores" in Spark; it only exists as a param to
spark-submit (which Spark then maps to the conf "spark.cores.max"). Also,
Livy does not currently support that param when constructing its
spark-submit; supporting it would be useful, and I would recommend opening a
JIRA to address it.

I can also point out a few issues with your POST for future reference while
I'm at it:
- The three "livy.spark.*" confs in your POST don't exist and would not be
used. Also, the value of "executorCores" is mapped to the conf
"spark.executor.cores", which makes setting that conf explicitly redundant.
- Similarly, "numExecutors" and "driverCores" map to
"spark.executor.instances" and "spark.driver.cores" respectively, which I
believe you tried setting with "livy.spark.*"-formatted confs.

So, in summary, you should open a JIRA for Livy to support the
"total-executor-cores" param for spark-submit, and for now you can see if
setting "spark.cores.max" in the conf works as a work-around (though I'm
not hopeful).
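
To make the above concrete, a sketch of what a cleaned-up session request could look like with the suggested spark.cores.max work-around (the Cassandra host list is carried over from the original request; whether standalone mode actually honors spark.cores.max when set through Livy is exactly what still needs to be verified):

    {
        "kind": "pyspark",
        "proxyUser": "root",
        "conf": {
            "spark.cassandra.connection.host": "10.128.1.1,10.128.1.2,10.128.1.3",
            "spark.cores.max": "4"
        },
        "executorMemory": "1G",
        "executorCores": 2,
        "numExecutors": 1,
        "driverCores": 1,
        "driverMemory": "1G"
    }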

                                                                                   
Alex Bozarth
Software Engineer
Spark Technology Center

E-mail: ajbozart@us.ibm.com
GitHub: github.com/ajbozarth

505 Howard Street
San Francisco, CA 94105
United States


Re: Session taking all the available resources even with number of cores specified

Posted by Junaid Nasir <jn...@an10.io>.
Hi Alex,

It is working correctly when I create a Spark session using bin/pyspark, and I can
create multiple sessions as well. The option that enables this is
"--total-executor-cores 4". I think Livy is not passing it to Spark?

We are using DC/OS, and Livy doesn't work with the Spark Mesos deployment, so we had
to set up a standalone Spark cluster.
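
For what it's worth, a quick way to check what that flag actually sets (spark-submit translates --total-executor-cores into the spark.cores.max conf, so it should show up on the SparkConf inside the session):

    ./spark-2.1.0/bin/pyspark --master spark://master:7077 --total-executor-cores 4
    >>> sc.getConf().get("spark.cores.max")   # prints '4' if the flag was applied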






Re: Session taking all the available resources even with number of cores specified

Posted by Alex Bozarth <aj...@us.ibm.com>.
Hi Junaid,

From my experience this is an issue with the Spark stand-alone cluster,
which is why it is recommended to run Livy with YARN instead, which should
allocate resources properly. @Jerry, you have seen more production uses of
Livy than me, am I correct that Livy with a stand-alone Spark cluster can't
handle multiple sessions?
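
For anyone who wants to try the YARN route, a rough sketch of the relevant livy.conf entries (assuming a working Hadoop/YARN setup with HADOOP_CONF_DIR exported in the Livy server's environment; check the names against the livy.conf.template shipped with your Livy version):

    # conf/livy.conf
    livy.spark.master = yarn
    livy.spark.deploy-mode = cluster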

                                                                                   
Alex Bozarth
Software Engineer
Spark Technology Center

E-mail: ajbozart@us.ibm.com
GitHub: github.com/ajbozarth

505 Howard Street
San Francisco, CA 94105
United States