Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/07/21 13:07:55 UTC

[GitHub] [airflow] wavewater opened a new issue #17135: ExasolHook get_pandas_df does not return pandas dataframe but None

wavewater opened a new issue #17135:
URL: https://github.com/apache/airflow/issues/17135


   
   When calling ExasolHook's get_pandas_df function (https://github.com/apache/airflow/blob/main/airflow/providers/exasol/hooks/exasol.py), I noticed that it does not return a pandas DataFrame; it returns None. In fact, the function's return type hint explicitly states None, but the name get_pandas_df implies that it should return a DataFrame.
   
   I think it would make more sense for get_pandas_df to actually return a DataFrame, as its name suggests. The code should look like this:
   
   `def get_pandas_df(self, sql: Union[str, list], parameters: Optional[dict] = None, **kwargs) -> pd.DataFrame:
       ... some code ...
       with closing(self.get_conn()) as conn:
           df = conn.export_to_pandas(sql, query_params=parameters, **kwargs)
           return df`
   
   INSTEAD OF:
   
   `def get_pandas_df(self, sql: Union[str, list], parameters: Optional[dict] = None, **kwargs) -> None:
       ... some code ...
       with closing(self.get_conn()) as conn:
           conn.export_to_pandas(sql, query_params=parameters, **kwargs)`
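
The difference between the two versions can be reproduced without Airflow or Exasol. The following is a minimal sketch, not the hook's actual code: `FakeConnection` is a hypothetical stand-in for the real connection object, and it returns a plain list instead of a DataFrame so that the example needs only the standard library.

```python
from contextlib import closing


class FakeConnection:
    """Hypothetical stand-in for the Exasol connection (not the real class)."""

    def __init__(self):
        self.closed = False

    def export_to_pandas(self, sql, query_params=None, **kwargs):
        # The real method would return a pandas DataFrame; a list of rows
        # is used here only to keep the sketch dependency-free.
        return [{"sql": sql, "value": 42}]

    def close(self):
        self.closed = True


def get_df_without_return(conn, sql):
    # Mirrors the current behaviour: the result is computed and discarded,
    # so the function implicitly returns None.
    with closing(conn) as c:
        c.export_to_pandas(sql)


def get_df_with_return(conn, sql):
    # Mirrors the proposed fix: the result is handed back to the caller.
    # closing() still closes the connection when the block exits.
    with closing(conn) as c:
        return c.export_to_pandas(sql)
```

Returning from inside the `with` block is safe here, because `closing()` calls `close()` on the connection when the block exits, including on an early return.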
   
   **Apache Airflow version**: 2.1.0
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`): Not using Kubernetes
   
   **Environment**: Official Airflow-Docker Image
   
   - **Cloud provider or hardware configuration**: no cloud - docker host (DELL Server with 48 Cores, 512GB RAM and many TB storage)
   - **OS** (e.g. from /etc/os-release): Official Airflow-Docker Image on CentOS 7 Host
   - **Kernel** (e.g. `uname -a`): Linux cad18b35be00 3.10.0-1160.21.1.el7.x86_64 #1 SMP Tue Mar 16 18:28:22 UTC 2021 x86_64 GNU/Linux
   - **Install tools**: only docker
   - **Others**:
   
   **What happened**:
   You can reproduce the issue with the following DAG file:
   
   import datetime

   from airflow import DAG
   # airflow.operators.python_operator is deprecated in Airflow 2.x
   from airflow.operators.python import PythonOperator
   # ExasolHook is defined in the hooks module of the provider
   from airflow.providers.exasol.hooks.exasol import ExasolHook


   default_args = {"owner": "airflow"}


   def call_exasol_hook(**kwargs):
       # Connect to Exasol and run a trivial query
       hook = ExasolHook(exasol_conn_id='Exasol QA')
       sql = 'select 42;'
       # Expected: a pandas DataFrame; actual: None
       df = hook.get_pandas_df(sql=sql)
       return df


   with DAG(
       dag_id="exasol_hook_problem",
       start_date=datetime.datetime(2021, 5, 5),
       schedule_interval="@once",
       default_args=default_args,
       catchup=False,
   ) as dag:

       set_variable = PythonOperator(
           task_id='call_exasol_hook',
           python_callable=call_exasol_hook
       )
   
   Sorry for the strange code formatting. I do not know how to fix this in the github UI form. 
   Sorry also in case I missed something.
    
   When testing or executing the task via the CLI:
   `airflow tasks test exasol_hook_problem call_exasol_hook 2021-07-20`
   
   the logs show:
   `[2021-07-21 12:53:19,775] {python.py:151} INFO - Done. Returned value was: None`
   
   None was returned even though get_pandas_df was called; a pandas DataFrame should have been returned instead.
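
Until the hook itself returns the result, one possible workaround in user code is to go through the connection directly. This is only a sketch: `fetch_dataframe` is a hypothetical helper name, and it assumes that `hook.get_conn()` returns a connection exposing `export_to_pandas(sql, query_params=...)` and `close()`, as the snippet quoted from the hook source suggests.

```python
from contextlib import closing


def fetch_dataframe(hook, sql, parameters=None, **kwargs):
    """Hypothetical helper: run `sql` through the hook's connection and return the result."""
    # Assumption: hook.get_conn() returns a connection object that offers
    # export_to_pandas() and close(), as in the hook code quoted in this issue.
    with closing(hook.get_conn()) as conn:
        return conn.export_to_pandas(sql, query_params=parameters, **kwargs)
```

Inside the task callable, `df = fetch_dataframe(hook, sql)` would then replace `df = hook.get_pandas_df(sql=sql)`.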
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] wavewater commented on issue #17135: ExasolHook get_pandas_df does not return pandas dataframe but None

Posted by GitBox <gi...@apache.org>.
wavewater commented on issue #17135:
URL: https://github.com/apache/airflow/issues/17135#issuecomment-884200313


   Sure, it will be my first pull request. Can you guide me?
   





[GitHub] [airflow] kaxil closed issue #17135: ExasolHook get_pandas_df does not return pandas dataframe but None

Posted by GitBox <gi...@apache.org>.
kaxil closed issue #17135:
URL: https://github.com/apache/airflow/issues/17135


   





[GitHub] [airflow] boring-cyborg[bot] commented on issue #17135: ExasolHook get_pandas_df does not return pandas dataframe but None

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #17135:
URL: https://github.com/apache/airflow/issues/17135#issuecomment-884175515


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] Goodkat commented on issue #17135: ExasolHook get_pandas_df does not return pandas dataframe but None

Posted by GitBox <gi...@apache.org>.
Goodkat commented on issue #17135:
URL: https://github.com/apache/airflow/issues/17135#issuecomment-903118361


   What is the current status here? May I take it and commit the changes then?





[GitHub] [airflow] uranusjr commented on issue #17135: ExasolHook get_pandas_df does not return pandas dataframe but None

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #17135:
URL: https://github.com/apache/airflow/issues/17135#issuecomment-884260770


   Sure, let's see what we can do.





[GitHub] [airflow] uranusjr commented on issue #17135: ExasolHook get_pandas_df does not return pandas dataframe but None

Posted by GitBox <gi...@apache.org>.
uranusjr commented on issue #17135:
URL: https://github.com/apache/airflow/issues/17135#issuecomment-884180729


   Hi, would you be interested in putting together a pull request for this?

