Posted to dev@hive.apache.org by "Eric Wang (JIRA)" <ji...@apache.org> on 2015/03/12 10:55:38 UTC
[jira] [Created] (HIVE-9940) The standard output of Python reduce script can not be interpreted correctly by Hive
Eric Wang created HIVE-9940:
-------------------------------
Summary: The standard output of Python reduce script can not be interpreted correctly by Hive
Key: HIVE-9940
URL: https://issues.apache.org/jira/browse/HIVE-9940
Project: Hive
Issue Type: Bug
Components: Hive
Reporter: Eric Wang
Use an HQL statement like:
FROM (
select_statement
) map_output
INSERT OVERWRITE TABLE table
REDUCE map_output.a, map_output.b
USING 'py_script'
AS col1, col2;
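For context, a reduce script of the kind referenced as 'py_script' might look like the sketch below. This is not the reporter's script: the names format_row and reduce_stream are hypothetical, and it assumes Hive's default script row format (tab-separated columns, one row per line). It echoes each row to stderr as well, which is one way to obtain the stderr log used for comparison in the cases that follow.

```python
import sys


def format_row(fields):
    """Join fields with tabs (assumed: Hive's default script column delimiter)."""
    return "\t".join(str(f) for f in fields)


def reduce_stream(lines, out=sys.stdout, err=sys.stderr):
    """Read tab-separated rows and write each one to both stdout and stderr."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        row = format_row(fields)
        out.write(row + "\n")  # the row Hive reads back
        err.write(row + "\n")  # the same row, mirrored to the task's stderr log


# When used as the actual reduce script, call: reduce_stream(sys.stdin)
```

Writing the identical string to both streams means any difference between the stderr log and the table contents must arise after the script has produced its output.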
(1) Original type
The Python stdout has records where the 2nd column = 'Meerjungfrau'
527500 Meerjungfrau 25 AO DE 20140704
...
Hive interprets these as:
527500 Meer <null> AO DE 20140704
...
The stderr log shows these as:
527500 Meerjungfrau 25 AO DE 20140704
(2) Change all 'Meerjungfrau' to 'bug' in the Python script
The Python stdout has records where the 2nd column = 'bug'
527500 bug 25 AO DE 20140704
...
Hive interprets these as:
527500 b <null> AO DE 20140704
...
The stderr log shows these as:
527500 bug 25 AO DE 20140704
(3) Move the 2nd column to the last position
The Python stdout has records where the last column = 'Meerjungfrau'
527500 25 AO DE 20140704 Meerjungfrau
...
Hive interprets these as:
527500 25 <null> 20140704 Meerjungfrau
...
The stderr log shows these as:
527500 25 AO DE 20140704 Meerjungfrau
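One way to narrow down cases like these is to inspect the raw bytes of each output row before Hive reads it. The helper below is a hypothetical diagnostic, not part of the original script; it assumes the columns are meant to be tab-delimited and that six columns are expected, as in the sample rows above. It reports the raw representation of a line and how many fields it splits into.

```python
def describe_row(line, expected_columns, delimiter="\t"):
    """Summarise how a row splits on the assumed delimiter."""
    fields = line.rstrip("\n").split(delimiter)
    return {
        "raw": repr(line),  # makes tabs, spaces and stray characters visible
        "field_count": len(fields),
        "matches_expected": len(fields) == expected_columns,
        "contains_spaces": " " in line,
    }


if __name__ == "__main__":
    # Sample row from case (1), written here with tabs as an assumption.
    sample = "527500\tMeerjungfrau\t25\tAO\tDE\t20140704\n"
    print(describe_row(sample, expected_columns=6))
```

If field_count differs from the number of columns expected, or contains_spaces is true for a row that should be tab-delimited, the delimiter in the script output is worth checking first.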
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)