Posted to commits@airflow.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2018/05/21 18:17:00 UTC

[jira] [Commented] (AIRFLOW-2471) Fix HiveCliHook.load_df to use unused parameters

    [ https://issues.apache.org/jira/browse/AIRFLOW-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16482839#comment-16482839 ] 

ASF subversion and git services commented on AIRFLOW-2471:
----------------------------------------------------------

Commit 1db3073374b6fd033651caf1fcb98e743483fa30 in incubator-airflow's branch refs/heads/master from [~sekikn]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=1db3073 ]

[AIRFLOW-2471] Fix HiveCliHook.load_df to use unused parameters

This PR fixes HiveCliHook.load_df to pass
the parameters create and recreate, which
are currently ignored, to load_file as
part of kwargs.

Closes #3390 from sekikn/AIRFLOW-2471
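
The change described above can be illustrated with a minimal sketch. This is not the Airflow source: SketchHook, load_df_buggy, load_df_fixed, and the simplified HQL strings are hypothetical stand-ins, and load_file here returns the HQL instead of running it. It only shows why arguments that are accepted but never forwarded leave the callee on its defaults.

```python
# Hypothetical stand-in for illustration only; not the real HiveCliHook.
class SketchHook:
    def load_file(self, filepath, table, create=True, recreate=False, **kwargs):
        """Build (and return, rather than execute) simplified HQL."""
        hql = ""
        if recreate:
            # Dropping first is what makes "recreate" differ from "create".
            hql += "DROP TABLE IF EXISTS {};\n".format(table)
        if create or recreate:
            hql += "CREATE TABLE IF NOT EXISTS {} (...);\n".format(table)
        hql += "LOAD DATA LOCAL INPATH '{}' INTO TABLE {};\n".format(filepath, table)
        return hql

    def load_df_buggy(self, filepath, table, create=True, recreate=False, **kwargs):
        # create/recreate are accepted but never forwarded, so load_file
        # always falls back to its own defaults.
        return self.load_file(filepath, table, **kwargs)

    def load_df_fixed(self, filepath, table, create=True, recreate=False, **kwargs):
        # Forward both flags explicitly so the caller's choice takes effect.
        return self.load_file(filepath, table, create=create,
                              recreate=recreate, **kwargs)


if __name__ == "__main__":
    hook = SketchHook()
    print(hook.load_df_buggy("/tmp/df.csv", "t", recreate=True))
    print(hook.load_df_fixed("/tmp/df.csv", "t", recreate=True))
```

In this sketch, only load_df_fixed emits a DROP TABLE statement when recreate=True is passed; load_df_buggy produces the same HQL regardless of the flag.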


> Fix HiveCliHook.load_df to use unused parameters
> ------------------------------------------------
>
>                 Key: AIRFLOW-2471
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2471
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: hive_hooks, hooks
>            Reporter: Kengo Seki
>            Assignee: Kengo Seki
>            Priority: Major
>             Fix For: 2.0.0
>
>
> HiveCliHook.load_df has parameters called create and recreate:
> {code}
>     def load_df(
>             self,
>             df,
>             table,
>             create=True,
>             recreate=False,
> (snip)
>         :param create: whether to create the table if it doesn't exist
>         :type create: bool
>         :param recreate: whether to drop and recreate the table at every
>             execution
>         :type recreate: bool
> {code}
> but these are ignored, so the default values are always used. For example, even when {{recreate=True}} is specified, {{DROP TABLE}} is not executed before {{CREATE TABLE}}.
> {code}
> In [1]: import pandas as pd
> In [2]: from airflow.hooks.hive_hooks import HiveCliHook
> In [3]: df = pd.DataFrame({"c": range(0, 10)})
> In [4]: h = HiveCliHook()
> [2018-05-16 10:27:55,814] {base_hook.py:83} INFO - Using connection to: localhost
> In [5]: h.load_df(df, "t", recreate=True)
> [2018-05-16 10:28:17,351] {hive_hooks.py:424} INFO - CREATE TABLE IF NOT EXISTS t (
> c BIGINT)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ','
> STORED AS textfile
> ;
> [2018-05-16 10:28:17,353] {hive_hooks.py:217} INFO - beeline -u jdbc:hive2://localhost:10000/default;auth=none -f /tmp/airflow_hiveop__zc0kY/tmp99TFZK
> [2018-05-16 10:28:19,730] {hive_hooks.py:232} INFO - Connecting to jdbc:hive2://localhost:10000/default;auth=none
> [2018-05-16 10:28:20,127] {hive_hooks.py:232} INFO - Connected to: Apache Hive (version 1.2.1)
> [2018-05-16 10:28:20,128] {hive_hooks.py:232} INFO - Driver: Hive JDBC (version 1.2.1)
> [2018-05-16 10:28:20,129] {hive_hooks.py:232} INFO - Transaction isolation: TRANSACTION_REPEATABLE_READ
> [2018-05-16 10:28:20,205] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> USE default;
> [2018-05-16 10:28:20,446] {hive_hooks.py:232} INFO - No rows affected (0.234 seconds)
> [2018-05-16 10:28:20,481] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> CREATE TABLE IF NOT EXISTS t (
> [2018-05-16 10:28:20,485] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> c BIGINT)
> [2018-05-16 10:28:20,491] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> ROW FORMAT DELIMITED
> [2018-05-16 10:28:20,497] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> FIELDS TERMINATED BY ','
> [2018-05-16 10:28:20,508] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> STORED AS textfile
> [2018-05-16 10:28:20,582] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default> ;No rows affected (0.074 seconds)
> [2018-05-16 10:28:20,597] {hive_hooks.py:232} INFO - 0: jdbc:hive2://localhost:10000/default>
> [2018-05-16 10:28:20,598] {hive_hooks.py:232} INFO - Closing: 0: jdbc:hive2://localhost:10000/default;auth=none
> (snip)
> {code}


