Posted to dev@airflow.apache.org by David Klosowski <da...@thinknear.com> on 2017/10/06 23:11:20 UTC

Using s3 logging in Airflow 1.9.x

Hey Airflow Devs:

How is s3 logging supposed to work in Airflow 1.9.0?

I've followed the *UPDATING.md* guide for the new logging setup, and while my
custom logging configuration module formats the files written to the host,
the S3 logging doesn't appear to work: I don't see anything in S3.

*> airflow.cfg*

task_log_reader = s3.task


*> custom logging module added to PYTHONPATH with __init__.py in directory*
----
import os

from airflow import configuration as conf

LOG_LEVEL = conf.get('core', 'LOGGING_LEVEL').upper()
LOG_FORMAT = conf.get('core', 'log_format')

BASE_LOG_FOLDER = conf.get('core', 'BASE_LOG_FOLDER')

FILENAME_TEMPLATE = '{{ ti.dag_id }}/{{ ti.task_id }}/{{ ts }}/{{ try_number }}.log'

STAGE = os.getenv('STAGE')

S3_LOG_FOLDER = 's3://tn-testing-bucket/dk'

LOGGING_CONFIG = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        'airflow.task': {
            'format': LOG_FORMAT,
        },
    },
    'handlers': {
        'console': {
            'class': 'logging.StreamHandler',
            'formatter': 'airflow.task',
            'stream': 'ext://sys.stdout'
        },
        'file.task': {
            'class': 'airflow.utils.log.file_task_handler.FileTaskHandler',
            'formatter': 'airflow.task',
            'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER),
            'filename_template': FILENAME_TEMPLATE,
        },
        # When using s3 or gcs, provide a customized LOGGING_CONFIG
        # in airflow_local_settings within your PYTHONPATH, see UPDATING.md
        # for details
        's3.task': {
            'class': 'airflow.utils.log.s3_task_handler.S3TaskHandler',
            'formatter': 'airflow.task',
            'base_log_folder': os.path.expanduser(BASE_LOG_FOLDER),
            's3_log_folder': S3_LOG_FOLDER,
            'filename_template': FILENAME_TEMPLATE,
        }
    },
    'loggers': {
        'airflow.task': {
            'handlers': ['s3.task'],
            'level': LOG_LEVEL,
            'propagate': False,
        },
        'airflow.task_runner': {
            'handlers': ['s3.task'],
            'level': LOG_LEVEL,
            'propagate': True,
        },
        'airflow': {
            'handlers': ['console'],
            'level': LOG_LEVEL,
            'propagate': False,
        },
    }
}
-----
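
For reference, given that S3_LOG_FOLDER and FILENAME_TEMPLATE, I'd expect the
handler to write keys shaped roughly like this (placeholders, not real values):

```
s3://tn-testing-bucket/dk/<dag_id>/<task_id>/<ts>/<try_number>.log
```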

I never see any task logs in S3, even after completion of all tasks.

Since I'm running this in Docker and my executors/workers are on different
hosts, when I try to pull up the task logs in the UI I receive the following
error because the logs aren't in S3:

Failed to fetch log file from worker.
HTTPConnectionPool(host='f0cf9e596af6', port=8793): Max retries
exceeded with url:



Any additional hints you can provide on what else needs to be done?

Thanks.

Regards,
David

Re: Using s3 logging in Airflow 1.9.x

Posted by David Klosowski <da...@thinknear.com>.
OK, thanks Chris. I'll pull from the test branch rather than the alpha tag
and see how that goes.

Cheers,
David


Re: Using s3 logging in Airflow 1.9.x

Posted by Chris Riccomini <cr...@apache.org>.
Also, we just pulled a fix into v1-9-test last week for a bug that was
causing S3TaskHandler not to upload anything.
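
If you want to double-check that your workers are actually running a build
that includes it, a quick check from the same environment (just a sketch) is:

```
# Print the Airflow version installed where the tasks run; it should
# correspond to the v1-9-test build rather than an older release.
import airflow
print(airflow.__version__)
```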


Re: Using s3 logging in Airflow 1.9.x

Posted by Chris Riccomini <cr...@apache.org>.
Have a look at this:

https://github.com/apache/incubator-airflow/pull/2671

I had to do a similar dance.



Re: Using s3 logging in Airflow 1.9.x

Posted by David Klosowski <da...@thinknear.com>.
Out of curiosity, has anyone who has tested the latest 1.9 branch of Airflow
actually had the S3 logging feature work?

I imagine anyone running Airflow in a distributed container environment will
want to view the task logs in the UI, and without this feature that won't
work, since you can't share the task logs at the host level when the
containers are distributed.

Thanks.

Regards,
David


Re: Using s3 logging in Airflow 1.9.x

Posted by David Klosowski <da...@thinknear.com>.
Hi Ash,

Thanks for the response.

I neglected to post that I do in fact have that config specified in the
airflow.cfg.

My file logger shows the custom formatting I have set and I see a log line
in the console from the scheduler:

{logging_config.py:42} INFO - Successfully imported user-defined logging
config from
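
Since each worker configures logging from its own environment, the same
import needs to succeed inside the worker containers as well; a minimal way
to verify (assuming the option lives under [core], as in the default config):

```
# Illustrative check, run inside a worker container: print the configured
# logging class the same way Airflow reads its other settings.
from airflow import configuration as conf
print(conf.get('core', 'logging_config_class'))
```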

Cheers,
David



Re: Using s3 logging in Airflow 1.9.x

Posted by Ash Berlin-Taylor <as...@firemirror.com>.
It could be that you have created a custom logging file, but you haven't specified it in your airflow.cfg:

```
logging_config_class=mymodule.LOGGING_CONFIG
```
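
A quick way to confirm the module really is importable from the environment
Airflow runs in, and that it defines the handler your task_log_reader points
at, is something like (using the placeholder module name above):

```
# Rough sanity check; 'mymodule' is a placeholder for the real module name.
from mymodule import LOGGING_CONFIG

assert 's3.task' in LOGGING_CONFIG['handlers']
assert 's3.task' in LOGGING_CONFIG['loggers']['airflow.task']['handlers']
```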





Re: Using s3 logging in Airflow 1.9.x

Posted by David Klosowski <da...@thinknear.com>.
Not sure if this is the issue, since I'm using the CeleryExecutor:

https://issues.apache.org/jira/browse/AIRFLOW-1667

Interestingly enough it doesn't work with the LocalExecutor either.
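
For what it's worth, here is a sketch of the kind of check that shows which
handler actually ends up on the task logger inside a given process (worker or
local):

```
# Importing airflow should apply whatever logging config is active in this
# environment; the 'airflow.task' logger should then carry the S3TaskHandler
# from the custom config rather than the default FileTaskHandler.
import logging

import airflow  # noqa: F401

for handler in logging.getLogger('airflow.task').handlers:
    print(type(handler))
```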

Regards,
David
