You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@airflow.apache.org by "Kengo Seki (JIRA)" <ji...@apache.org> on 2018/05/21 16:36:00 UTC
[jira] [Created] (AIRFLOW-2500) Fix MySqlToHiveTransfer to transfer
unsigned type properly
Kengo Seki created AIRFLOW-2500:
-----------------------------------
Summary: Fix MySqlToHiveTransfer to transfer unsigned type properly
Key: AIRFLOW-2500
URL: https://issues.apache.org/jira/browse/AIRFLOW-2500
Project: Apache Airflow
Issue Type: Bug
Components: operators
Reporter: Kengo Seki
Assignee: Kengo Seki
Given the following table,
{code}
mysql> USE airflow_ci
Database changed
mysql> DESC users;
+-------+------------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+------------------+------+-----+---------+-------+
| id | int(10) unsigned | YES | | NULL | |
+-------+------------------+------+-----+---------+-------+
1 row in set (0.00 sec)
mysql> SELECT * FROM users;
+------------+
| id |
+------------+
| 2147483647 |
| 2147483648 |
+------------+
2 rows in set (0.00 sec)
{code}
executing MySqlToHiveTransfer:
{code}
In [1]: from airflow.operators.mysql_to_hive import MySqlToHiveTransfer
...: t = MySqlToHiveTransfer(sql="SELECT * FROM airflow_ci.users", hive_table="users", recreate=True, task_id="t")
...: t.execute(None)
...:
[2018-05-21 12:14:09,137] {base_hook.py:83} INFO - Using connection to: localhost
[2018-05-21 12:14:09,140] {base_hook.py:83} INFO - Using connection to: localhost
[2018-05-21 12:14:09,146] {hive_hooks.py:427} INFO - DROP TABLE IF EXISTS users;
CREATE TABLE IF NOT EXISTS users (
id INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ''
STORED AS textfile
;
(snip)
[2018-05-21 12:14:31,667] {hive_hooks.py:233} INFO - Loading data to table default.users
[2018-05-21 12:14:32,364] {hive_hooks.py:233} INFO - Table default.users stats: [numFiles=1, numRows=0, totalSize=24, rawDataSize=0]
[2018-05-21 12:14:32,365] {hive_hooks.py:233} INFO - OK
[2018-05-21 12:14:32,366] {hive_hooks.py:233} INFO - Time taken: 1.299 seconds
{code}
... then the value greater than the upper bound for signed integer is not properly fetched from Hive.
{code}
hive> SELECT * FROM users;
OK
2147483647
NULL
Time taken: 2.461 seconds, Fetched: 2 row(s)
{code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)