Posted to dev@airflow.apache.org by Victor Monteiro <vi...@ubee.in> on 2017/08/08 17:52:34 UTC

[webserver] Webserver time_starttransfer very high

Hi everyone.

The problem is very straightforward. When I make a request to the Airflow
webserver, it takes too long to send the first byte.

[image: Screen Shot 2017-08-08 at 2.42.55 PM.png]

As you can see in the picture, it took 6 seconds to send the first byte. I
already checked the connection to the database, and it took 36ms to
list all task instances. So I am starting to think there is a problem
with the Airflow webserver or my deployment.
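
For anyone who wants to reproduce the measurement without a browser, here is
a minimal sketch in Python. It times the gap between sending a GET request and
receiving the response headers, which is roughly what curl reports as
time_starttransfer. The function name, host, port and path are illustrative
assumptions, not something taken from this deployment.

```python
import http.client
import time


def time_to_first_byte(host, port, path="/", timeout=30.0):
    """Return seconds from sending a GET request until the response headers arrive."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        start = time.perf_counter()
        conn.request("GET", path)      # opens the TCP connection and sends the request
        response = conn.getresponse()  # returns once the status line and headers are read
        elapsed = time.perf_counter() - start
        response.read()                # drain the body before closing
        return elapsed
    finally:
        conn.close()
```

A call such as time_to_first_byte("localhost", 8080, "/admin/") would time one
page load, assuming the webserver listens on port 8080 and serves that path.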

To give you more details about deployment and configurations:

   - *web_server_worker_timeout = 120*
   - *workers = 4*
   - *sql_alchemy_pool_size = 5*
   - *sql_alchemy_pool_recycle = 3600*
   - *AWS RDS postgres*
   - *AWS m4.large*

Does anyone know what can be causing this problem?

Thank you :D

Re: [webserver] Webserver time_starttransfer very high

Posted by Maxime Beauchemin <ma...@gmail.com>.
The Apache mailing list doesn't support sending images; the solution is to
send a link to a hosted image or a reference to a JIRA ticket that features
the image.

Max


Re: [webserver] Webserver time_starttransfer very high

Posted by Victor Duarte Diniz Monteiro <vi...@inlocomedia.com>.
Hi folks, thank you for all the questions you raised. They enabled us to
investigate this issue. After a while, and after testing the webserver
locally, we found that the problem was mostly caused by a DAG with 54 tasks
inside it; that was the major cost when requesting a page. It took 1.5
seconds to parse the tasks. Add some connection overhead to that and we
were bound to get a long waiting time.
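
A rough way to check a similar suspicion is to time how long a single DAG
file takes to execute as a plain Python module. The sketch below assumes the
DAG definition is an ordinary .py file; the function name and the module name
are made up for illustration. Airflow's own DAG loading does more work than
this, so treat the number as a lower bound rather than the real parse time.

```python
import importlib.util
import time


def time_module_import(path, name="dag_under_test"):
    """Execute the Python file at `path` once and return the elapsed seconds."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    start = time.perf_counter()
    spec.loader.exec_module(module)  # runs the file's top-level code
    return time.perf_counter() - start
```

If the value comes out anywhere near a second for one file, the DAG
definition itself is a likely contributor to slow page loads.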

Sorry for the alarm; it was our fault. But I am glad we learned that. 😁


Re: [webserver] Webserver time_starttransfer very high

Posted by Bolke de Bruin <bd...@gmail.com>.
Did you run a tcpdump? Did you test with another web server? Did you try putting something like Nginx in front, which is just better at doing web serving (we do, also to add SSL, and we don't see this issue)? Is your client host name resolvable? (Logging usually reports host names; if a reverse DNS lookup cannot be done, it will take time to time out.)
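
The reverse DNS question can be checked from the webserver host with a few
lines of Python. This is only a sketch: the function name is hypothetical, and
the address to pass in is whatever client IP shows up in your access log.

```python
import socket
import time


def time_reverse_lookup(ip_address):
    """Return (hostname or None, elapsed seconds) for a reverse DNS lookup."""
    start = time.perf_counter()
    try:
        hostname = socket.gethostbyaddr(ip_address)[0]
    except OSError:  # socket.herror and socket.gaierror are OSError subclasses
        hostname = None
    return hostname, time.perf_counter() - start
```

A lookup that takes seconds, or returns None only after a long wait, would
point at name resolution rather than at the webserver.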


Sent from my iPad


Re: [webserver] Webserver time_starttransfer very high

Posted by Victor Duarte Diniz Monteiro <vi...@inlocomedia.com>.
Hi Max,

We have 3693 task instances in the database.
We have also created an index on start_date for the task_instance table, as
you suggested, but it is still slow. We don't think the problem is in the
database, because ad hoc queries return quickly when we run them.


Re: [webserver] Webserver time_starttransfer very high

Posted by Maxime Beauchemin <ma...@gmail.com>.
It seems like the default sort should be on start_date, desc, and yes, there
should be an index on that.

Also, 100 per page is probably enough.

Can you try [something like] that in your environment and report the loading
times?

Also, for context, how many task instances do you have in total?

Max


Re: [webserver] Webserver time_starttransfer very high

Posted by Victor Monteiro <vi...@ubee.in>.
[image: PNG image]
Screen Shot 2017-08-08 at 2.42.55 PM.png
<https://drive.google.com/a/ubee.in/file/d/0B7u1tjyaPWJQeVVtbFIzcS11eWc/view?usp=drivesdk>

I am sending the image one more time, hosted on Google Drive.
https://drive.google.com/a/ubee.in/file/d/0B7u1tjyaPWJQeVVtbFIzcS11eWc/view?usp=drivesdk

Edgar, did you find any solution to speed up the webserver?


Re: [webserver] Webserver time_starttransfer very high

Posted by Edgar Rodriguez <ed...@airbnb.com.INVALID>.
I've been profiling the web UI for the last few days and I think I've been
able to identify some of the issues. I've seen similar response times from
the webserver.
A couple of things that I found specifically for the task instance view are:
1. Page sizes on views are usually too large, and all HTML rendering is
done server side; flask_admin introduces some latency rendering the
templates for 500 TIs at a time in the TaskInstanceModelView, see
[AIRFLOW-1483 <https://issues.apache.org/jira/browse/AIRFLOW-1483>]
2. An unindexed column is used as the default for ordering (required for
paging), which triggers a sort on TI requests, e.g. TaskInstanceModelView
uses `job_id` as the default sort column, but there's no index for that, see
[AIRFLOW-1495 <https://issues.apache.org/jira/browse/AIRFLOW-1495>]
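
To illustrate the second point, the sketch below uses an in-memory SQLite
database as a stand-in for the real metadata database. The table is a
simplified, assumed version of task_instance, the index name is made up, and
the query planner of Postgres will report its plans differently. The idea is
only to show how the plan for a paged, sorted query changes once the sort
column is indexed.

```python
import sqlite3


def query_plan(conn, sql):
    """Return the 'detail' column of SQLite's EXPLAIN QUERY PLAN output."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


conn = sqlite3.connect(":memory:")
# Simplified stand-in for the task_instance table (assumed columns).
conn.execute(
    "CREATE TABLE task_instance "
    "(task_id TEXT, dag_id TEXT, job_id INTEGER, start_date TEXT)"
)
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?, ?)",
    [("t%d" % i, "example_dag", i, "2017-08-%02d" % (i % 28 + 1)) for i in range(1000)],
)

# One page of 100 rows, newest first.
PAGE_QUERY = "SELECT * FROM task_instance ORDER BY start_date DESC LIMIT 100"

plan_before = query_plan(conn, PAGE_QUERY)  # no index on the sort column yet
conn.execute("CREATE INDEX ti_start_date ON task_instance (start_date)")
plan_after = query_plan(conn, PAGE_QUERY)   # index available for the ordering

print(plan_before)
print(plan_after)
```

Comparing the two printed plans shows whether the database still needs a
separate sort step to serve each page.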

Cheers,
Edgar


Re: [webserver] Webserver time_starttransfer very high

Posted by Victor Monteiro <vi...@ubee.in>.
Sorry, I am sending it again.

Also, it is always between 3s and 6s.


Re: [webserver] Webserver time_starttransfer very high

Posted by Ash Berlin-Taylor <as...@firemirror.com>.
(Your screenshot didn't come through for me, possibly because the list stripped it? That said:)

Is it always 6 seconds, or does it settle down after making a few requests, enough so that each worker stands a chance to have loaded the app and any deps?

i.e. the problem might just be that of warm-up.
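
One way to check for warm-up is to request the same page several times in a
row and compare the durations. The sketch below is an assumption-laden
example: the function name and the default of 10 attempts are arbitrary, and
with several workers behind the webserver you would want more attempts than
workers so that each one is likely to be hit at least once.

```python
import time
import urllib.request


def sample_response_times(url, attempts=10, timeout=30.0):
    """Fetch `url` repeatedly and return the duration of each request in seconds."""
    durations = []
    for _ in range(attempts):
        start = time.perf_counter()
        with urllib.request.urlopen(url, timeout=timeout) as response:
            response.read()  # include the full body download in the timing
        durations.append(time.perf_counter() - start)
    return durations
```

If only the first few values are slow and the rest drop sharply, warm-up is
the likely explanation; if every value stays high, it is not.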

-ash