Posted to issues@hive.apache.org by "Zhihua Deng (JIRA)" <ji...@apache.org> on 2019/01/25 06:54:00 UTC

[jira] [Commented] (HIVE-10773) MapJoinOperator times out on loading HashTable

    [ https://issues.apache.org/jira/browse/HIVE-10773?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16751954#comment-16751954 ] 

Zhihua Deng commented on HIVE-10773:
------------------------------------

We hit the same issue when running a job on MapReduce. In one of our cases, 99% of the entries stored in the dumped hashtable are one-to-one key-value mappings. Even though the file is no larger than 10 MB, the mapper takes more than half an hour to load the table, which has about 200,000 keys.
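
To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes that "10m" means 10 MB, treats "more than half an hour" as exactly 30 minutes (so the computed rates are upper bounds), and compares against the 600000 ms value of mapreduce.task.timeout quoted in the issue description below. The variable names are illustrative only.

```python
# Figures taken from the comment above. The file size is an upper bound and
# the load time is a lower bound, so the rates below are upper bounds.
keys = 200_000
load_seconds = 30 * 60            # "more than half an hour"
file_bytes = 10 * 1024 * 1024     # assumption: "10m" means 10 MB

keys_per_second = keys / load_seconds
bytes_per_second = file_bytes / load_seconds

# Raised task timeout quoted in the issue description, in milliseconds.
raised_timeout_ms = 600_000
timeouts_spanned = load_seconds * 1000 / raised_timeout_ms

print(f"at most ~{keys_per_second:.0f} keys/s")
print(f"at most ~{bytes_per_second / 1024:.1f} KiB/s")
print(f"load time is at least {timeouts_spanned:.0f}x the raised timeout")
```

If the task reports no progress while it loads the table, a load this slow would still exceed the raised timeout several times over, so in a case like ours increasing mapreduce.task.timeout alone may not be a practical workaround.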

> MapJoinOperator times out on loading HashTable
> ----------------------------------------------
>
>                 Key: HIVE-10773
>                 URL: https://issues.apache.org/jira/browse/HIVE-10773
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.14.0
>            Reporter: frank luo
>            Priority: Major
>
> When running a map join, depending on the data, the task might time out, with the last two lines in the log as shown below. When I run "set mapreduce.task.timeout=600000;" (the value defaults to 300000), the query goes through fine. The size of the hashtable file is roughly 400M.
> 2015-05-20 13:27:03,237 INFO [main] org.apache.hadoop.hive.ql.exec.MapJoinOperator: ******* Load from HashTable for input file: hdfs://nameservice1/tmp/hive/jluo/2ee8914d-1cef-4af4-aac6-51f64d630346/hive_2015-05-20_13-13-35_335_1565066409090716856-1/-mr-10007/000000_0
> 2015-05-20 13:27:03,237 INFO [main] org.apache.hadoop.hive.ql.exec.MapJoinOperator: 	Load back 1 hashtable file from tmp file uri:file:/data/12/hadoop/yarn/local/usercache/xxy/appcache/application_1430337284339_2087/container_1430337284339_2087_01_000003/Stage-3.tar.gz/MapJoin-mapfile31--.hashtable



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)