Posted to commits@airflow.apache.org by "Kengo Seki (JIRA)" <ji...@apache.org> on 2018/05/16 14:39:00 UTC

[jira] [Created] (AIRFLOW-2471) Fix HiveCliHook.load_df to use unused parameters

Kengo Seki created AIRFLOW-2471:
-----------------------------------

             Summary: Fix HiveCliHook.load_df to use unused parameters
                 Key: AIRFLOW-2471
                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2471
             Project: Apache Airflow
          Issue Type: Bug
          Components: hive_hooks, hooks
            Reporter: Kengo Seki
            Assignee: Kengo Seki


HiveCliHook.load_df has parameters called create and recreate:


{code}
    def load_df(
            self,
            df,
            table,
            create=True,
            recreate=False,

(snip)

        :param create: whether to create the table if it doesn't exist
        :type create: bool
        :param recreate: whether to drop and recreate the table at every
            execution
        :type recreate: bool
{code}

but neither parameter is actually used, so the method always behaves as if both were left at their default values. For example, even when {{recreate=True}} is specified, {{DROP TABLE}} is not executed before {{CREATE TABLE}}:

{code}
In [1]: import pandas as pd

In [2]: from airflow.hooks.hive_hooks import HiveCliHook

In [3]: df = pd.DataFrame({"c": range(0, 10)})

In [4]: h = HiveCliHook()
[2018-05-16 10:27:55,814] {base_hook.py:83} INFO - Using connection to: localhost

In [5]: h.load_df(df, "t", recreate=True)
[2018-05-16 10:28:17,351] {hive_hooks.py:424} INFO - CREATE TABLE IF NOT EXISTS t (
c BIGINT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS textfile
;
[2018-05-16 10:28:17,353] {hive_hooks.py:217} INFO - beeline -u jdbc:hive2://localhost:10000/default;auth=none -f /tmp/airflow_hiveop__zc0kY/tmp99TFZK
[2018-05-16 10:28:19,730] {hive_hooks.py:232} INFO - Connecting to jdbc:hive2://localhost:10000/default;auth=none
[2018-05-16 10:28:20,127] {hive_hooks.py:232} INFO - Connected to: Apache Hive (version 1.2.1)
[2018-05-16 10:28:20,128] {hive_hooks.py:232} INFO - Driver: Hive JDBC (version 1.2.1)
[2018-05-16 10:28:20,129] {hive_hooks.py:232} INFO - Transaction isolation: TRANSACTION_REPEATABLE_READ
[2018-05-16 10:28:20,205] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> USE default;
[2018-05-16 10:28:20,446] {hive_hooks.py:232} INFO - No rows affected (0.234 seconds)
[2018-05-16 10:28:20,481] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> CREATE TABLE IF NOT EXISTS t (
[2018-05-16 10:28:20,485] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> c BIGINT)
[2018-05-16 10:28:20,491] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> ROW FORMAT DELIMITED
[2018-05-16 10:28:20,497] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> FIELDS TERMINATED BY ','
[2018-05-16 10:28:20,508] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> STORED AS textfile
[2018-05-16 10:28:20,582] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> ;No rows affected (0.074 seconds)
[2018-05-16 10:28:20,597] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default>
[2018-05-16 10:28:20,598] {hive_hooks.py:232} INFO - Closing: 0: jdbc:hive2://localhost:10000/default;auth=none

(snip)
{code}
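
One way the two flags could be honored is sketched below. This is only an illustration, not the actual patch: {{build_table_ddl}} and its {{field_dict}} and {{delimiter}} arguments are hypothetical names, and the statement text simply mirrors the log above, with a {{DROP TABLE IF EXISTS}} added when {{recreate}} is set.

```python
def build_table_ddl(table, field_dict, create=True, recreate=False, delimiter=","):
    """Return the HQL to run before loading data (hypothetical helper).

    field_dict maps column names to Hive types, e.g. {"c": "BIGINT"}.
    """
    statements = []
    if recreate:
        # Drop first so the CREATE below always starts from an empty table.
        statements.append("DROP TABLE IF EXISTS {};".format(table))
    if create or recreate:
        columns = ",\n".join(
            "{} {}".format(name, hive_type)
            for name, hive_type in field_dict.items()
        )
        statements.append(
            "CREATE TABLE IF NOT EXISTS {} (\n{})\n"
            "ROW FORMAT DELIMITED\n"
            "FIELDS TERMINATED BY '{}'\n"
            "STORED AS textfile;".format(table, columns, delimiter)
        )
    # With create=False and recreate=False no DDL is emitted at all.
    return "\n".join(statements)


if __name__ == "__main__":
    # Assumed usage: the same single-column table as in the session above.
    print(build_table_ddl("t", {"c": "BIGINT"}, recreate=True))
```

With {{recreate=True}} the returned text starts with the {{DROP TABLE}} statement, and with {{create=False}} and {{recreate=False}} it is empty, so an existing table would be left untouched.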


