Posted to dev@livy.apache.org by Jeff Zhang <zj...@gmail.com> on 2019/07/06 13:26:06 UTC

Re: Livy Sessions

For the dead/killed sessions, could you check the YARN app logs?

Hugo Herlanin <hu...@lendico.com.br> wrote on Thu, Jul 4, 2019 at 9:41 PM:

>
> Hey, the user mailing list is not working for me!
>
> I am having some problems with my Livy setup. My use case is as follows: I
> use a DAG in Airflow (1.10) to create an EMR cluster (5.24.1, one m4.large
> master and two m5a.xlarge nodes), and when it is ready, this DAG sends 5 to 7
> simultaneous requests to Livy. I don't think I'm doing anything unusual with
> the Livy settings; I just set livy.spark.deploy-mode = client and
> livy.repl.enable-hive-context = true.
>
> The problem is that of these ~5 to 7 sessions, only one or two open (go to
> 'idle') and all the others go straight to 'dead' or 'killed'; the YARN logs
> show that the sessions were killed by the 'livy' user. I tried tinkering with
> all the possible timeout settings, but this still happens. If I send more
> than ~10 simultaneous requests, Livy responds with 500, and if I keep sending
> requests, the server freezes. This happens even when EMR has enough resources
> available.
>
> I know the cluster can handle that many sessions, because it works when I
> open them via a loop with an interval of 15 seconds or more, but it feels
> like Livy should be able to deal with that many requests simultaneously. It
> seems strange that I should need to manage the queue this way for the API of
> a distributed system.
>
> Do you have any clue about what I might be doing wrong? Is there any known
> limitation that I'm unaware of?
>
> Best,
>
> Hugo Herlanin
>
>

-- 
Best Regards

Jeff Zhang
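
For context, here is a minimal sketch (not from the thread) of the kind of
session-creation call the DAG described above would be making against the Livy
REST API, together with the polling that decides whether a session reached
'idle' or went 'dead'/'killed'. The endpoint URL, session kind, timeout, and
polling interval are assumptions, not values from the thread; the two settings
Hugo mentions live in livy.conf on the server, not in the request body.

    import time
    import requests

    # Assumed Livy endpoint on the EMR master node; adjust host/port to your cluster.
    LIVY_URL = "http://emr-master:8998"


    def create_session(poll_interval=5, timeout=300):
        """Create one interactive Livy session and poll until it settles in a final state."""
        # livy.spark.deploy-mode and livy.repl.enable-hive-context are server-side
        # settings (livy.conf), so the request body only needs the session kind.
        resp = requests.post(f"{LIVY_URL}/sessions", json={"kind": "pyspark"})
        resp.raise_for_status()
        session_id = resp.json()["id"]

        deadline = time.time() + timeout
        while time.time() < deadline:
            state = requests.get(f"{LIVY_URL}/sessions/{session_id}").json()["state"]
            if state in ("idle", "dead", "killed", "error"):
                return session_id, state
            time.sleep(poll_interval)
        return session_id, "timeout"

Since the deploy mode is client, each of these sessions starts its Spark driver
on the master node alongside the Livy server, which may be relevant to the CPU
observations later in the thread.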

Re: Livy Sessions

Posted by Marco Gaido <ma...@gmail.com>.
Hi Carlos,

I meant profiling the Livy server with a JVM profiler.

Thanks,
Bests.
Marco

On Tue, 9 Jul 2019 at 00:18, Kadu Vido <carlos.vido@lendico.com.br> wrote:

> Hi, Marco,
>
> We're using Livy 0.6.0, which I'm afraid is what ships by default with EMR
> 5.24, so I cannot test an older version.
>
> It's a holiday in Brazil and our network is IP-gated (which is also why I
> don't have the logs), so we won't be able to access it tomorrow. I'll run
> that profiling as soon as we're back on Wednesday. I'm assuming you want a
> *perf record*; let me know otherwise.
>
> *Carlos Vido *
>
> Data Engineer @ Lendico Brasil <https://www.lendico.com.br>
>
>
> On Mon, 8 Jul 2019 at 16:55, Marco Gaido <ma...@gmail.com> wrote:
>
>> Hi all,
>>
>> This seems like a performance issue in the Livy server. I assume you are
>> using a recent version of Livy.
>>
>> If this is the case, could you profile the Livy server to understand what
>> the problem is?
>>
>> Thanks,
>> Marco
>>
>> On Mon, 8 Jul 2019, 21:03 Kadu Vido, <ca...@lendico.com.br> wrote:
>>
>>> Hi, I'm working with Hugo on the same project.
>>>
>>> Shubham, we're using almost the same setup; the only difference is Airflow
>>> 1.10.1. I coded a workaround in our Livy hook: it has a parameter for
>>> retries, and whenever the session returns anything other than 'idle', we
>>> try again before failing the task. It's not ideal, but at least our
>>> pipelines aren't stuck anymore.
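
A rough sketch of the retry behaviour described here, written against the
plain Livy REST API rather than the actual Airflow hook; the endpoint URL,
retry count, and timeouts below are illustrative assumptions, not the
project's real code:

    import time
    import requests

    LIVY_URL = "http://emr-master:8998"  # assumed endpoint

    def open_session_with_retries(retries=3, poll_interval=5, timeout=180):
        """Keep creating sessions until one reaches 'idle', or give up after `retries` attempts."""
        for attempt in range(retries):
            resp = requests.post(f"{LIVY_URL}/sessions", json={"kind": "pyspark"})
            resp.raise_for_status()
            session_id = resp.json()["id"]
            deadline = time.time() + timeout
            while time.time() < deadline:
                state = requests.get(f"{LIVY_URL}/sessions/{session_id}").json()["state"]
                if state == "idle":
                    return session_id
                if state in ("dead", "killed", "error"):
                    break  # this attempt failed; clean up and try again
                time.sleep(poll_interval)
            # Delete the dead or hung session before the next attempt.
            requests.delete(f"{LIVY_URL}/sessions/{session_id}")
        raise RuntimeError(f"No idle session after {retries} attempts")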
>>>
>>> Zhang, I don't have the YARN logs at hand, but I can search for them if
>>> you'd like to take a look. However, our latest clues point in a different
>>> direction:
>>>
>>> 1. Running *top* on the master node, we observed that Livy rapidly takes
>>> all the available CPUs after we send just a few requests (3 or 4 already
>>> cause this to happen; if we send upwards of 10, it will crash the
>>> service).
>>>
>>> 2. We can get around this by spacing them out a bit -- that is, if we use
>>> a loop to open the sessions and wait ~10s between them (a sketch of such a
>>> loop follows this list), it gives Livy enough time to release the CPU
>>> resources before trying to open a new one. We've had help from some AWS
>>> engineers who tried several instance sizes and found that on larger
>>> instances they can open 10 or 12 simultaneously, but:
>>>
>>> 3. Regardless of the size of the cluster, we cannot hold more than 9
>>> simultaneous sessions open. It doesn't matter if our cluster has enough
>>> vCPUs or RAM to handle more, and the size of the master node doesn't matter
>>> either: from the 10th session onwards, each one seems to either die or drop.
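
To illustrate item 2 above, a minimal sketch (not Lendico's actual code) of
opening several sessions in a loop with a fixed pause between creation calls;
the endpoint URL and the 10-second delay are assumptions taken from the
description above:

    import time
    import requests

    LIVY_URL = "http://emr-master:8998"   # assumed endpoint
    DELAY_BETWEEN_SESSIONS = 10           # seconds, per the observation above

    def open_sessions_spaced(count):
        """Open `count` sessions, pausing between creation requests so Livy can settle."""
        session_ids = []
        for i in range(count):
            resp = requests.post(f"{LIVY_URL}/sessions", json={"kind": "pyspark"})
            resp.raise_for_status()
            session_ids.append(resp.json()["id"])
            if i < count - 1:
                time.sleep(DELAY_BETWEEN_SESSIONS)
        return session_ids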
>>>
>>> *Carlos Vido *
>>>
>>> Data Engineer @ Lendico Brasil <https://www.lendico.com.br>
>>>
>>>
>>> On Sat, 6 Jul 2019 at 13:30, Shubham Gupta <y2...@gmail.com>
>>> wrote:
>>>
>>>> I'm facing precisely the same issue.
>>>> .
>>>> I've written a LivySessionHook that's just a wrapper over the pylivy
>>>> Session <https://pylivy.readthedocs.io/en/latest/api/session.html>.
>>>>
>>>>    - I'm able to use this hook to send code snippets to a remote EMR via
>>>>    the Python shell a few times, after which it starts throwing "caught
>>>>    exception 500 Server Error: Internal Server Error for url" (and
>>>>    continues to do so for the next hour or so).
>>>>    - However, when the same hook is triggered via an Airflow operator, I
>>>>    get absolutely no success (it always results in a 500 error).
>>>>
>>>> .
>>>> I'm using
>>>>
>>>>    - Airflow 1.10.3
>>>>    - Python 3.7.3
>>>>    - EMR 5.24.1
>>>>    - Livy 0.6.0
>>>>    - Spark 2.4.2
>>>>
>>>>
>>>> *Shubham Gupta*
>>>> Software Engineer
>>>>  zomato
>>>>
>>>>
>>>> On Sat, Jul 6, 2019 at 6:56 PM Jeff Zhang <zj...@gmail.com> wrote:
>>>>
>>>>> For the dead/killed sessions, could you check the YARN app logs?
>>>>>
>>>>> Hugo Herlanin <hu...@lendico.com.br> wrote on Thu, Jul 4, 2019 at 9:41 PM:
>>>>>
>>>>>>
>>>>>> Hey, the user mailing list is not working for me!
>>>>>>
>>>>>> I am having some problems with my Livy setup. My use case is as follows:
>>>>>> I use a DAG in Airflow (1.10) to create an EMR cluster (5.24.1, one
>>>>>> m4.large master and two m5a.xlarge nodes), and when it is ready, this
>>>>>> DAG sends 5 to 7 simultaneous requests to Livy. I don't think I'm doing
>>>>>> anything unusual with the Livy settings; I just set
>>>>>> livy.spark.deploy-mode = client and
>>>>>> livy.repl.enable-hive-context = true.
>>>>>>
>>>>>> The problem is that of these ~5 to 7 sessions, only one or two open (go
>>>>>> to 'idle') and all the others go straight to 'dead' or 'killed'; the
>>>>>> YARN logs show that the sessions were killed by the 'livy' user. I tried
>>>>>> tinkering with all the possible timeout settings, but this still
>>>>>> happens. If I send more than ~10 simultaneous requests, Livy responds
>>>>>> with 500, and if I keep sending requests, the server freezes. This
>>>>>> happens even when EMR has enough resources available.
>>>>>>
>>>>>> I know the cluster can handle that many sessions, because it works when
>>>>>> I open them via a loop with an interval of 15 seconds or more, but it
>>>>>> feels like Livy should be able to deal with that many requests
>>>>>> simultaneously. It seems strange that I should need to manage the queue
>>>>>> this way for the API of a distributed system.
>>>>>>
>>>>>> Do you have any clue about what I might be doing wrong? Is there any
>>>>>> known limitation that I'm unaware of?
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Hugo Herlanin
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Best Regards
>>>>>
>>>>> Jeff Zhang
>>>>>
>>>>

Re: Livy Sessions

Posted by Marco Gaido <ma...@gmail.com>.
Hi all,

This seems like a performance issue in the Livy server. I assume you are
using a recent version of Livy.

If this is the case, could you profile the Livy server to understand what
the problem is?

Thanks,
Marco

On Mon, 8 Jul 2019, 21:03 Kadu Vido, <ca...@lendico.com.br> wrote:

> Hi, I'm working with Hugo on the same project.
>
> Shubham, we're using almost the same setup; the only difference is Airflow
> 1.10.1. I coded a workaround in our Livy hook: it has a parameter for
> retries, and whenever the session returns anything other than 'idle', we
> try again before failing the task. It's not ideal, but at least our
> pipelines aren't stuck anymore.
>
> Zhang, I don't have the YARN logs at hand, but I can search for them if
> you'd like to take a look. However, our latest clues point in a different
> direction:
>
> 1. Running *top* on the master node, we observed that Livy rapidly takes
> all the available CPUs after we send just a few requests (3 or 4 already
> cause this to happen; if we send upwards of 10, it will crash the service).
>
> 2. We can get around this by spacing them out a bit -- that is, if we use a
> loop to open the sessions and wait ~10s between them, it gives Livy enough
> time to release the CPU resources before trying to open a new one. We've
> had help from some AWS engineers who tried several instance sizes and found
> that on larger instances they can open 10 or 12 simultaneously, but:
>
> 3. Regardless of the size of the cluster, we cannot hold more than 9
> simultaneous sessions open. It doesn't matter if our cluster has enough
> vCPUs or RAM to handle more, and the size of the master node doesn't matter
> either: from the 10th session onwards, each one seems to either die or drop.
>
> *Carlos Vido *
>
> Data Engineer @ Lendico Brasil <https://www.lendico.com.br>
>
>
> On Sat, 6 Jul 2019 at 13:30, Shubham Gupta <y2...@gmail.com>
> wrote:
>
>> I'm facing precisely the same issue.
>> .
>> I've written a LivySessionHook that's just a wrapper over the pylivy Session
>> <https://pylivy.readthedocs.io/en/latest/api/session.html>.
>>
>>    - I'm able to use this hook to send code snippets to a remote EMR via
>>    the Python shell a few times, after which it starts throwing "caught
>>    exception 500 Server Error: Internal Server Error for url" (and
>>    continues to do so for the next hour or so).
>>    - However, when the same hook is triggered via an Airflow operator, I
>>    get absolutely no success (it always results in a 500 error).
>>
>> .
>> I'm using
>>
>>    - Airflow 1.10.3
>>    - Python 3.7.3
>>    - EMR 5.24.1
>>    - Livy 0.6.0
>>    - Spark 2.4.2
>>
>>
>> *Shubham Gupta*
>> Software Engineer
>>  zomato
>>
>>
>> On Sat, Jul 6, 2019 at 6:56 PM Jeff Zhang <zj...@gmail.com> wrote:
>>
>>> For the dead/killed sessions, could you check the YARN app logs?
>>>
>>> Hugo Herlanin <hu...@lendico.com.br> wrote on Thu, Jul 4, 2019 at 9:41 PM:
>>>
>>>>
>>>> Hey, the user mailing list is not working for me!
>>>>
>>>> I am having some problems with my Livy setup. My use case is as follows: I
>>>> use a DAG in Airflow (1.10) to create an EMR cluster (5.24.1, one m4.large
>>>> master and two m5a.xlarge nodes), and when it is ready, this DAG sends 5 to
>>>> 7 simultaneous requests to Livy. I don't think I'm doing anything unusual
>>>> with the Livy settings; I just set livy.spark.deploy-mode = client and
>>>> livy.repl.enable-hive-context = true.
>>>>
>>>> The problem is that of these ~5 to 7 sessions, only one or two open (go to
>>>> 'idle') and all the others go straight to 'dead' or 'killed'; the YARN logs
>>>> show that the sessions were killed by the 'livy' user. I tried tinkering
>>>> with all the possible timeout settings, but this still happens. If I send
>>>> more than ~10 simultaneous requests, Livy responds with 500, and if I keep
>>>> sending requests, the server freezes. This happens even when EMR has enough
>>>> resources available.
>>>>
>>>> I know the cluster can handle that many sessions, because it works when I
>>>> open them via a loop with an interval of 15 seconds or more, but it feels
>>>> like Livy should be able to deal with that many requests simultaneously. It
>>>> seems strange that I should need to manage the queue this way for the API
>>>> of a distributed system.
>>>>
>>>> Do you have any clue about what I might be doing wrong? Is there any known
>>>> limitation that I'm unaware of?
>>>>
>>>> Best,
>>>>
>>>> Hugo Herlanin
>>>>
>>>>
>>>
>>> --
>>> Best Regards
>>>
>>> Jeff Zhang
>>>
>>

Re: Livy Sessions

Posted by Shubham Gupta <y2...@gmail.com>.
I'm facing precisely the same issue.
.
I've written a LivySessionHook that's just a wrapper over the pylivy Session
<https://pylivy.readthedocs.io/en/latest/api/session.html>.

   - I'm able to use this hook to send code snippets to a remote EMR via
   the Python shell a few times, after which it starts throwing "caught
   exception 500 Server Error: Internal Server Error for url" (and
   continues to do so for the next hour or so).
   - However, when the same hook is triggered via an Airflow operator, I get
   absolutely no success (it always results in a 500 error).

.
I'm using

   - Airflow 1.10.3
   - Python 3.7.3
   - EMR 5.24.1
   - Livy 0.6.0
   - Spark 2.4.2
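
For comparison, this is roughly what a hook like the one described above ends
up doing against the Livy REST API when it submits a code snippet to an
existing session. It is a sketch using plain requests rather than pylivy; the
endpoint URL and the example snippet are made up:

    import time
    import requests

    LIVY_URL = "http://emr-master:8998"  # assumed endpoint

    def run_snippet(session_id, code, poll_interval=2):
        """Submit one statement to an existing Livy session and wait for its output."""
        resp = requests.post(
            f"{LIVY_URL}/sessions/{session_id}/statements", json={"code": code}
        )
        resp.raise_for_status()
        statement_id = resp.json()["id"]
        while True:
            stmt = requests.get(
                f"{LIVY_URL}/sessions/{session_id}/statements/{statement_id}"
            ).json()
            if stmt["state"] in ("available", "error", "cancelled"):
                return stmt["output"]
            time.sleep(poll_interval)

    # Example (session 0 must already be 'idle'):
    # output = run_snippet(0, "spark.range(10).count()")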


*Shubham Gupta*
Software Engineer
 zomato


On Sat, Jul 6, 2019 at 6:56 PM Jeff Zhang <zj...@gmail.com> wrote:

> For the dead/killed sessions, could you check the YARN app logs?
>
> Hugo Herlanin <hu...@lendico.com.br> wrote on Thu, Jul 4, 2019 at 9:41 PM:
>
>>
>> Hey, the user mailing list is not working for me!
>>
>> I am having some problems with my Livy setup. My use case is as follows: I
>> use a DAG in Airflow (1.10) to create an EMR cluster (5.24.1, one m4.large
>> master and two m5a.xlarge nodes), and when it is ready, this DAG sends 5 to 7
>> simultaneous requests to Livy. I don't think I'm doing anything unusual with
>> the Livy settings; I just set livy.spark.deploy-mode = client and
>> livy.repl.enable-hive-context = true.
>>
>> The problem is that of these ~5 to 7 sessions, only one or two open (go to
>> 'idle') and all the others go straight to 'dead' or 'killed'; the YARN logs
>> show that the sessions were killed by the 'livy' user. I tried tinkering with
>> all the possible timeout settings, but this still happens. If I send more
>> than ~10 simultaneous requests, Livy responds with 500, and if I keep sending
>> requests, the server freezes. This happens even when EMR has enough resources
>> available.
>>
>> I know the cluster can handle that many sessions, because it works when I
>> open them via a loop with an interval of 15 seconds or more, but it feels
>> like Livy should be able to deal with that many requests simultaneously. It
>> seems strange that I should need to manage the queue this way for the API of
>> a distributed system.
>>
>> Do you have any clue about what I might be doing wrong? Is there any known
>> limitation that I'm unaware of?
>>
>> Best,
>>
>> Hugo Herlanin
>>
>>
>
> --
> Best Regards
>
> Jeff Zhang
>