Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2009/02/27 21:01:18 UTC

[jira] Commented: (HIVE-308) UNION ALL should create different destination directories for different operands

    [ https://issues.apache.org/jira/browse/HIVE-308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677499#action_12677499 ] 

Zheng Shao commented on HIVE-308:
---------------------------------

The problem with the first query (a map-only job) is that it has 2 file sink operators, and both of them write to the same temp file.
See the log:

2009-02-27 11:55:42,528 INFO org.apache.hadoop.metrics.jvm.JvmMetrics: Initializing JVM Metrics with processName=MAP, sessionId=
2009-02-27 11:55:42,595 INFO org.apache.hadoop.mapred.MapTask: numReduceTasks: 0
2009-02-27 11:55:42,612 INFO org.apache.hadoop.mapred.MapTask: split: hdfs://xxxx:9000/warehouse/zshao_lazy/8413_m_000000_0.gz, range: 0-28
2009-02-27 11:55:42,631 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2009-02-27 11:55:42,632 INFO org.apache.hadoop.io.compress.zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
2009-02-27 11:55:42,789 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Initializing Self
2009-02-27 11:55:42,793 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias null-subquery2:a-subquery2:zshao_lazy to work list for file /warehouse/zshao_lazy/8413_m_000000_0.gz
2009-02-27 11:55:42,793 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Adding alias null-subquery1:a-subquery1:zshao_lazy to work list for file /warehouse/zshao_lazy/8413_m_000000_0.gz
2009-02-27 11:55:42,802 INFO org.apache.hadoop.hive.ql.exec.MapOperator: Got partitions: null
2009-02-27 11:55:42,802 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing Self
2009-02-27 11:55:42,802 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing children:
2009-02-27 11:55:42,802 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self
2009-02-27 11:55:42,802 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children:
2009-02-27 11:55:42,802 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self
2009-02-27 11:55:42,802 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children:
2009-02-27 11:55:42,802 INFO org.apache.hadoop.hive.ql.exec.ForwardOperator: Initializing Self
2009-02-27 11:55:42,802 INFO org.apache.hadoop.hive.ql.exec.ForwardOperator: Initializing children:
2009-02-27 11:55:42,803 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self
2009-02-27 11:55:42,803 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children:
2009-02-27 11:55:42,803 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self
2009-02-27 11:55:42,804 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: /tmp/hive-zshao/_tmp.92566742.10001.insclause-0/_tmp.10549_m_000000_0
2009-02-27 11:55:42,813 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initialization Done
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.ForwardOperator: Initialization Done
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initialization Done
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initialization Done
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initialization Done
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing Self
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.TableScanOperator: Initializing children:
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children:
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children:
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.ForwardOperator: Initializing Self
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.ForwardOperator: Initializing children:
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing Self
2009-02-27 11:55:42,817 INFO org.apache.hadoop.hive.ql.exec.SelectOperator: Initializing children:
2009-02-27 11:55:42,818 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Initializing Self
2009-02-27 11:55:42,819 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: /tmp/hive-zshao/_tmp.92566742.10001.insclause-0/_tmp.10549_m_000000_0
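
The collision can be confirmed mechanically from the log. The following is a minimal sketch, not part of Hive: it takes the two FileSinkOperator lines quoted above and reports any temp file that more than one file sink claims. The function name conflicting_temp_files and the regular expression are assumptions made for this illustration.

```python
import re
from collections import Counter

# The two FileSinkOperator lines copied from the task log above.
LOG = """\
2009-02-27 11:55:42,804 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: /tmp/hive-zshao/_tmp.92566742.10001.insclause-0/_tmp.10549_m_000000_0
2009-02-27 11:55:42,819 INFO org.apache.hadoop.hive.ql.exec.FileSinkOperator: Writing to temp file: /tmp/hive-zshao/_tmp.92566742.10001.insclause-0/_tmp.10549_m_000000_0
"""

# Assumed pattern: matches the log message format shown above.
PATTERN = re.compile(r"FileSinkOperator: Writing to temp file: (\S+)")


def conflicting_temp_files(log_text):
    """Return {path: count} for temp files opened by more than one file sink."""
    counts = Counter(PATTERN.findall(log_text))
    return {path: n for path, n in counts.items() if n > 1}


conflicts = conflicting_temp_files(LOG)
for path, n in conflicts.items():
    print(f"{n} file sinks write to {path}")
```

Both sinks resolve to the same path under insclause-0, which is the conflict described in the issue.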



> UNION ALL should create different destination directories for different operands
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-308
>                 URL: https://issues.apache.org/jira/browse/HIVE-308
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Priority: Blocker
>
> The following query hangs:
> {code} 
> select * from (select 1 from zshao_lazy union all select 2 from zshao_lazy) a;
> {code} 
> The following query produces wrong results (one map-reduce job overwrites, or cannot overwrite, the result of the other):
> {code} 
> select * from (select 1 as id from zshao_lazy cluster by id union all select 2 as id from zshao_meta) a;
> {code} 
> The cause of both is that the destination directories of the file sink operators conflict with each other.
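
To illustrate the idea in the issue title, here is a rough sketch of what giving each UNION ALL operand its own destination directory would look like. It is not Hive code and not the actual fix: the helper sink_dir and the "union-N" subdirectory naming are hypothetical, chosen only to show that distinct directories yield distinct output paths for the same task file name.

```python
import posixpath

# Base directory and task file name as they appear in the log for this issue.
BASE_DIR = "/tmp/hive-zshao/_tmp.92566742.10001.insclause-0"
TASK_FILE = "_tmp.10549_m_000000_0"


def sink_dir(base_dir, operand_index):
    """Hypothetical scheme: one subdirectory per UNION ALL operand."""
    return posixpath.join(base_dir, f"union-{operand_index}")


# Current behavior: both operands share one directory, so the paths collide.
shared = [posixpath.join(BASE_DIR, TASK_FILE) for _ in range(2)]

# Sketch of the proposal: each operand writes under its own directory.
separate = [posixpath.join(sink_dir(BASE_DIR, i), TASK_FILE) for i in range(2)]

print("shared paths collide:", len(set(shared)) < len(shared))
print("separate paths collide:", len(set(separate)) < len(separate))
```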
