Posted to commits@airflow.apache.org by "Kengo Seki (JIRA)" <ji...@apache.org> on 2018/05/16 14:39:00 UTC
[jira] [Created] (AIRFLOW-2471) Fix HiveCliHook.load_df to use unused parameters
Kengo Seki created AIRFLOW-2471:
-----------------------------------
Summary: Fix HiveCliHook.load_df to use unused parameters
Key: AIRFLOW-2471
URL: https://issues.apache.org/jira/browse/AIRFLOW-2471
Project: Apache Airflow
Issue Type: Bug
Components: hive_hooks, hooks
Reporter: Kengo Seki
Assignee: Kengo Seki
HiveCliHook.load_df has parameters called create and recreate:
{code}
def load_df(
        self,
        df,
        table,
        create=True,
        recreate=False,
        (snip)
        :param create: whether to create the table if it doesn't exist
        :type create: bool
        :param recreate: whether to drop and recreate the table at every
            execution
        :type recreate: bool
{code}
but these parameters are never read, so the method always behaves as if the default values had been passed. For example, even when {{recreate=True}} is specified, {{DROP TABLE}} is not executed before {{CREATE TABLE}}:
{code}
In [1]: import pandas as pd
In [2]: from airflow.hooks.hive_hooks import HiveCliHook
In [3]: df = pd.DataFrame({"c": range(0, 10)})
In [4]: h = HiveCliHook()
[2018-05-16 10:27:55,814] {base_hook.py:83} INFO - Using connection to: localhost
In [5]: h.load_df(df, "t", recreate=True)
[2018-05-16 10:28:17,351] {hive_hooks.py:424} INFO - CREATE TABLE IF NOT EXISTS t (
c BIGINT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS textfile
;
[2018-05-16 10:28:17,353] {hive_hooks.py:217} INFO - beeline -u jdbc:hive2://localhost:10000/default;auth=none -f /tmp/airflow_hiveop__zc0kY/tmp99TFZK
[2018-05-16 10:28:19,730] {hive_hooks.py:232} INFO - Connecting to jdbc:hive2://localhost:10000/default;auth=none
[2018-05-16 10:28:20,127] {hive_hooks.py:232} INFO - Connected to: Apache Hive (version 1.2.1)
[2018-05-16 10:28:20,128] {hive_hooks.py:232} INFO - Driver: Hive JDBC (version 1.2.1)
[2018-05-16 10:28:20,129] {hive_hooks.py:232} INFO - Transaction isolation: TRANSACTION_REPEATABLE_READ
[2018-05-16 10:28:20,205] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> USE default;
[2018-05-16 10:28:20,446] {hive_hooks.py:232} INFO - No rows affected (0.234 seconds)
[2018-05-16 10:28:20,481] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> CREATE TABLE IF NOT EXISTS t (
[2018-05-16 10:28:20,485] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> c BIGINT)
[2018-05-16 10:28:20,491] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> ROW FORMAT DELIMITED
[2018-05-16 10:28:20,497] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> FIELDS TERMINATED BY ','
[2018-05-16 10:28:20,508] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> STORED AS textfile
[2018-05-16 10:28:20,582] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> ;No rows affected (0.074 seconds)
[2018-05-16 10:28:20,597] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default>
[2018-05-16 10:28:20,598] {hive_hooks.py:232} INFO - Closing: 0: jdbc:hive2://localhost:10000/default;auth=none
(snip)
{code}
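For comparison, the behavior described by the docstring could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Airflow code or the patch for this issue: the helper name {{build_table_ddl}} and its {{columns}} argument are hypothetical, and it assumes that {{recreate=True}} implies creating the table again after dropping it. The statement layout mirrors the {{CREATE TABLE}} shown in the log above.

```python
def build_table_ddl(table, columns, create=True, recreate=False, delimiter=","):
    """Return the HiveQL to run before loading data into `table`.

    Hypothetical helper for illustration only; it is not part of Airflow.
    `columns` is a list of (column_name, hive_type) pairs.
    """
    statements = []
    if recreate:
        # recreate=True: drop any existing table before creating it again.
        statements.append("DROP TABLE IF EXISTS {};".format(table))
    if create or recreate:
        # Assumption: recreate implies create, since the table was just dropped.
        fields = ",\n".join(
            "    {} {}".format(name, hive_type) for name, hive_type in columns
        )
        statements.append(
            "CREATE TABLE IF NOT EXISTS {} (\n{})\n"
            "ROW FORMAT DELIMITED\n"
            "FIELDS TERMINATED BY '{}'\n"
            "STORED AS textfile;".format(table, fields, delimiter)
        )
    # With create=False and recreate=False no DDL is produced at all.
    return "\n".join(statements)


if __name__ == "__main__":
    print(build_table_ddl("t", [("c", "BIGINT")], recreate=True))
```

With a helper like this, {{recreate=True}} yields a {{DROP TABLE IF EXISTS}} statement ahead of {{CREATE TABLE}}, while {{create=False, recreate=False}} yields no DDL, which is the distinction the current implementation does not make.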