Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/12/22 14:46:13 UTC

[GitHub] [airflow] nicnguyen3103 commented on issue #26931: Templating (like {{ ds }} ) stopped working in papermill after upgrade from 2.3.x to 2.4.x

nicnguyen3103 commented on issue #26931:
URL: https://github.com/apache/airflow/issues/26931#issuecomment-1362925375

   I found the issue in the Papermill operator code (or in Airflow). Template rendering (`render_template`) apparently **runs after `__init__` but before `execute()`**. Because the operator populates `self.inlets` in the constructor from `self.parameters`, the `NoteBook` object in `self.inlets` is created with **unrendered parameters and is never updated**. The code below illustrates the flow:
   ```
   class PapermillOperator(BaseOperator):
       """
       Executes a jupyter notebook through papermill that is annotated with parameters
   
       :param input_nb: input notebook (can also be a NoteBook or a File inlet)
       :param output_nb: output notebook (can also be a NoteBook or File outlet)
       :param parameters: the notebook parameters to set
       :param kernel_name: (optional) name of kernel to execute the notebook against
           (ignores kernel name in the notebook document metadata)
       """
   
       supports_lineage = True
   
       template_fields: Sequence[str] = ('input_nb', 'output_nb', 'parameters', 'kernel_name', 'language_name')
   
       def __init__(
           self,
           *,
           input_nb: Optional[str] = None,
           output_nb: Optional[str] = None,
           parameters: Optional[Dict] = None,
           kernel_name: Optional[str] = None,
           language_name: Optional[str] = None,
           **kwargs,
       ) -> None:
           super().__init__(**kwargs)
   
           self.input_nb = input_nb
           self.output_nb = output_nb
           self.parameters = parameters
           self.kernel_name = kernel_name
           self.language_name = language_name
           
           # self.inlets is populated here, before templates are rendered, so it captures the unrendered self.parameters
           if input_nb:
               self.inlets.append(NoteBook(url=input_nb, parameters=self.parameters))
           if output_nb:
               self.outlets.append(NoteBook(url=output_nb))
   
       def execute(self, context: 'Context'):
           # self.parameters has been rendered by this point, but self.inlets was never updated
           if not self.inlets or not self.outlets:
               raise ValueError("Input notebook or output notebook is not specified")
   
           for i, item in enumerate(self.inlets):
               pm.execute_notebook(
                   item.url,
                   self.outlets[i].url,
                   parameters=item.parameters,
                   progress_bar=False,
                   report_mode=True,
                   kernel_name=self.kernel_name,
                   language=self.language_name,
               )
   ```
   The problem is fixed if you modify the `execute` method as follows, populating `self.inlets` and `self.outlets` inside `execute()`, after rendering has run. (This assumes the corresponding `append` calls are removed from the constructor; otherwise each notebook would be listed twice.)
   ```
       def execute(self, context: 'Context'):
           if not self.input_nb or not self.output_nb:
               raise ValueError("Input notebook or output notebook is not specified")
           self.inlets.append(NoteBook(url=self.input_nb, 
                                       parameters=self.parameters))
           self.outlets.append(NoteBook(url=self.output_nb))
   
           for i, item in enumerate(self.inlets):
               pm.execute_notebook(
                   item.url,
                   self.outlets[i].url,
                   parameters=item.parameters,
                   progress_bar=False,
                   report_mode=True,
                   kernel_name=self.kernel_name,
                   language=self.language_name,
               )
   ```
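
The ordering problem can be reproduced without Airflow or papermill. The following is a minimal sketch, not the provider's code: `NoteBook` here is a simplified stand-in, and `render`, `render_fields`, `EagerOperator`, `LazyOperator`, and `run_lifecycle` are hypothetical names introduced for illustration. It assumes that rendering rebinds `self.parameters` to a new dict rather than mutating the original in place.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class NoteBook:
    """Simplified stand-in for the lineage entity used by the operator."""
    url: str
    parameters: Optional[Dict[str, str]] = None


def render(value: Dict[str, str], context: Dict[str, str]) -> Dict[str, str]:
    """Toy renderer (hypothetical): replaces '{{ key }}' markers, returns a NEW dict."""
    rendered = {}
    for name, text in value.items():
        for key, replacement in context.items():
            text = text.replace("{{ " + key + " }}", replacement)
        rendered[name] = text
    return rendered


class EagerOperator:
    """Builds the NoteBook in __init__, so it holds the unrendered dict."""

    def __init__(self, input_nb: str, parameters: Dict[str, str]) -> None:
        self.input_nb = input_nb
        self.parameters = parameters
        self.inlets: List[NoteBook] = [NoteBook(url=input_nb, parameters=parameters)]

    def render_fields(self, context: Dict[str, str]) -> None:
        # Assumption: rendering rebinds the attribute; the NoteBook keeps the old dict.
        self.parameters = render(self.parameters, context)

    def execute(self) -> List[Optional[Dict[str, str]]]:
        return [item.parameters for item in self.inlets]


class LazyOperator(EagerOperator):
    """Defers building the NoteBook until execute(), after rendering."""

    def __init__(self, input_nb: str, parameters: Dict[str, str]) -> None:
        self.input_nb = input_nb
        self.parameters = parameters
        self.inlets = []  # nothing is captured at construction time

    def execute(self) -> List[Optional[Dict[str, str]]]:
        self.inlets.append(NoteBook(url=self.input_nb, parameters=self.parameters))
        return [item.parameters for item in self.inlets]


def run_lifecycle(op: EagerOperator, context: Dict[str, str]) -> List[Optional[Dict[str, str]]]:
    """Mimics the order described above: construct, then render, then execute."""
    op.render_fields(context)
    return op.execute()


if __name__ == "__main__":
    ctx = {"ds": "2022-12-22"}
    print(run_lifecycle(EagerOperator("in.ipynb", {"run_date": "{{ ds }}"}), ctx))
    print(run_lifecycle(LazyOperator("in.ipynb", {"run_date": "{{ ds }}"}), ctx))
```

In this sketch the eager variant passes the raw `{{ ds }}` string through to execution, while the lazy variant sees the rendered date, which matches the behaviour described above.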
   I am currently on Airflow 2.5.0 and am now looking at the Airflow `ti` (task instance) code to see whether the rendering step changed.

