Posted to dev@airflow.apache.org by Nicholas Hodgkinson <ni...@collectivehealth.com> on 2017/08/21 18:16:37 UTC

Fixing logging for Airflow inside Docker

All,

I've got a problem I'm trying to solve: I've expanded my Airflow
machine into a cluster. This itself works; however, I'm having a problem
accessing the logs through the UI. I know this is because I run my
Airflow workers (and other processes) inside Docker containers. I get
this error when trying to access logs:

*** Log file isn't local.
*** Fetching here:
http://f6400f7aea88:8793/log/cjob/queue/2017-08-18T22:44:09.334353
*** Failed to fetch log file from worker.

Now I understand that "f6400f7aea88" is the hostname of the Docker container
within Docker; however, the container is not running on the same machine as
the webserver, so this address cannot be resolved. So my question is: how can
I change either the address that the web UI uses or the address that the
worker reports back?
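
As far as I can tell, the worker just reports whatever hostname it sees for
itself. A quick check inside the container illustrates the problem (I'm
assuming it uses something like Python's socket.getfqdn(); that's a guess on
my part):

    # Run inside the worker container; this prints the container ID
    # (e.g. "f6400f7aea88") rather than an address resolvable from other hosts.
    python -c "import socket; print(socket.getfqdn())"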

Thanks,
-Nik
nik.hodgkinson@collectivehealth.com


Re: Fixing logging for Airflow inside Docker

Posted by Nicholas Hodgkinson <ni...@collectivehealth.com>.
Yongjun,

Thank you; that is exactly what I was looking for.

Nolan,

Thanks for your response. It wasn't exactly what I was looking for, but I
appreciate you taking the time to respond.

-N
nik.hodgkinson@collectivehealth.com


Re: Fixing logging for Airflow inside Docker

Posted by Yongjun Park <th...@gmail.com>.
Hi Nicholas.

I've got the same situation: my dockerized Airflow workers are running
on different EC2 instances. To resolve this issue, I set the hostname of
each Docker container to the IP address of its EC2 instance.

If you are using Docker Compose, you can add a *hostname* field to the YAML
file. Otherwise, use the *-h* option of docker run to set the hostname.
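
For example (a rough sketch; the image name and IP below are placeholders,
and I'm assuming one worker container per EC2 instance with port 8793
published for the worker log server):

    # docker-compose.yml (sketch)
    version: "3"
    services:
      worker:
        image: my-airflow-image      # placeholder
        hostname: 10.0.1.23          # the EC2 instance's private IP
        ports:
          - "8793:8793"              # worker log server port

    # or, with plain docker run:
    docker run -h 10.0.1.23 -p 8793:8793 my-airflow-image worker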

Thanks,
Yongjun



Re: Fixing logging for Airflow inside Docker

Posted by Nolan Emirot <no...@turo.com>.
Hi Nicholas,

Airflow will first try to find logs locally; if none are found, it will try
to fetch them from S3 or GCS. The only drawback is that you won't see the
logs of tasks that are currently running if you only push logs to S3. See
the documentation:
<https://airflow.incubator.apache.org/configuration.html#logs>

It's hard to give you a good answer; did you use this image:
https://github.com/puckel ?

Here are two workarounds:

- If you take the S3 approach, I highly recommend this post
<https://stackoverflow.com/questions/39997714/airflow-s3-connection-using-ui>;
the configuration is a little bit tricky. (See the airflow.cfg sketch after
this list.)
- You can also mount a volume on /usr/airflow/logs so the logs will be
shared across your instances.
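
For the S3 route, the relevant settings would look roughly like this in
airflow.cfg (a sketch based on that configuration doc page; the bucket path
and connection id are placeholders, and the connection itself is created in
the UI as the post above describes):

    [core]
    # placeholders -- create the S3 connection under Admin -> Connections first
    remote_base_log_folder = s3://my-airflow-logs/logs
    remote_log_conn_id = my_s3_conn
    encrypt_s3_logs = False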

For the volume, I would choose an elastic file system (EFS), because I don't
know whether logs are trimmed or rotated in Airflow; see:
https://stackoverflow.com/questions/43548744/removing-airflow-task-logs
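
Mounting the shared directory might look like this (a sketch; the EFS mount
point /mnt/efs/airflow-logs and the image name are placeholders, and
/usr/airflow/logs is assumed to match your base_log_folder):

    # mount an EFS-backed host directory into every webserver and worker container
    docker run -v /mnt/efs/airflow-logs:/usr/airflow/logs my-airflow-image worker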


Best,

Nolan

