You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "Galak (JIRA)" <ji...@apache.org> on 2019/01/21 21:22:00 UTC

[jira] [Commented] (AIRFLOW-2508) Handle non string types in render_template_from_field

    [ https://issues.apache.org/jira/browse/AIRFLOW-2508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16748225#comment-16748225 ] 

Galak commented on AIRFLOW-2508:
--------------------------------

[Pull Request #4292|https://github.com/apache/incubator-airflow/pull/4292] is waiting for a code review.

> Handle non string types in render_template_from_field
> -----------------------------------------------------
>
>                 Key: AIRFLOW-2508
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2508
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: models
>    Affects Versions: 2.0.0
>            Reporter: Eugene Brown
>            Assignee: Galak
>            Priority: Minor
>              Labels: easyfix, newbie
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> The render_template_from_field method of the BaseOperator class raises an exception when it encounters content that is not a string_type, list, tuple or dict.
> Example exception:
> {noformat}
> airflow.exceptions.AirflowException: Type '<type 'int'>' used for parameter 'job_flow_overrides[Instances][InstanceGroups][InstanceCount]' is not supported for templating{noformat}
> I propose instead that when it encounters content of other types it returns the content unchanged, rather than raising an exception.
> Consider this case: I extended the EmrCreateJobFlowOperator to make the job_flow_overrides argument a templatable field. job_flow_overrides is a dictionary with a mix of strings, integers and booleans for values.
> When I extended the class as such:
> {code:java}
> class EmrCreateJobFlowOperatorTemplateOverrides(EmrCreateJobFlowOperator):
>     template_fields = ['job_flow_overrides']{code}
> And added a task to my dag with this format:
> {code:java}
> step_create_cluster = EmrCreateJobFlowOperatorTemplateOverrides(
>     task_id="create_cluster",
>     job_flow_overrides={
>         "Name": "my-cluster {{ dag_run.conf['run_date'] }}",
>         "Instances": {
>             "InstanceGroups": [
>                 {
>                     "Name": "Master nodes",
>                     "InstanceType": "c3.4xlarge",
>                     "InstanceCount": 1
>                 },
>                 {
>                     "Name": "Slave nodes",
>                     "InstanceType": "c3.4xlarge",
>                     "InstanceCount": 4
>                 },
>                 "TerminationProtected": False
>             ]
>         },
>         "BootstrapActions": [{
>              "Name": "Custom action",
>              "ScriptBootstrapAction": {
>                  "Path": "s3://repo/{{ dag_run.conf['branch'] }}/requirements.txt"
>              }
>         }],
>    },
>    aws_conn_id='aws_default',
>    emr_conn_id='aws_default',
>    dag=dag
> )
> {code}
> The exception I gave above was raised and the step failed. I think it would be preferable for the method to instead pass over numeric and boolean values as users may want to use template_fields in the way I have to template string values in dictionaries or lists of mixed types.
> Here is the render_template_from_field method from the BaseOperator:
> {code:java}
> def render_template_from_field(self, attr, content, context, jinja_env):
>     """
>     Renders a template from a field. If the field is a string, it will
>     simply render the string and return the result. If it is a collection or
>     nested set of collections, it will traverse the structure and render
>     all strings in it.
>     """
>     rt = self.render_template
>     if isinstance(content, six.string_types):
>         result = jinja_env.from_string(content).render(**context)
>     elif isinstance(content, (list, tuple)):
>         result = [rt(attr, e, context) for e in content]
>     elif isinstance(content, dict):
>         result = {
>             k: rt("{}[{}]".format(attr, k), v, context)
>             for k, v in list(content.items())}
>     else:
>         param_type = type(content)
>         msg = (
>             "Type '{param_type}' used for parameter '{attr}' is "
>             "not supported for templating").format(**locals())
>         raise AirflowException(msg)
>     return result{code}
>  I propose that the method returns content unchanged if the content is of one of (int, float, complex, bool) types. So my solution would include an extra elif in the form:
> {code}
> elif isinstance(content, (int, float, complex, bool)):
>     result = content
> {code}
>  Are there any reasons this would be a bad idea?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)