Posted to issues@hive.apache.org by "Harsh J (JIRA)" <ji...@apache.org> on 2016/11/15 08:43:58 UTC

[jira] [Assigned] (HIVE-11325) Infinite loop in HiveHFileOutputFormat

     [ https://issues.apache.org/jira/browse/HIVE-11325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Harsh J reassigned HIVE-11325:
------------------------------

    Assignee:     (was: Harsh J)

> Infinite loop in HiveHFileOutputFormat
> --------------------------------------
>
>                 Key: HIVE-11325
>                 URL: https://issues.apache.org/jira/browse/HIVE-11325
>             Project: Hive
>          Issue Type: Bug
>          Components: HBase Handler
>    Affects Versions: 1.0.0
>            Reporter: Harsh J
>         Attachments: HIVE-11325.patch
>
>
> No idea why {{hbase_handler_bulk.q}} does not catch this if it's being run regularly in Hive builds, but here's the gist of the issue:
> The condition at https://github.com/apache/hive/blob/master/hbase-handler/src/java/org/apache/hadoop/hive/hbase/HiveHFileOutputFormat.java#L152-L164 shows that we loop until we find a file whose last path component (the name) is equal to the column family name.
> In execution, however, the iteration becomes an actual infinite loop, because the path we end up treating as srcDir is the region file, whose name will never match the family name.
> This is an example of the IPC call that the listing loop of a task at 100% progress gets stuck repeating:
> {code}
> 2015-07-21 10:32:20,662 TRACE [main] org.apache.hadoop.ipc.ProtobufRpcEngine: 1: Call -> cdh54.vm/172.16.29.132:8020: getListing {src: "/user/hive/warehouse/hbase_test/_temporary/1/_temporary/attempt_1436935612068_0011_m_000000_0/family/97112ac1c09548ae87bd85af072d2e8c" startAfter: "" needLocation: false}
> 2015-07-21 10:32:20,662 DEBUG [IPC Parameter Sending Thread #1] org.apache.hadoop.ipc.Client: IPC Client (1551465414) connection to cdh54.vm/172.16.29.132:8020 from hive sending #510346
> 2015-07-21 10:32:20,662 DEBUG [IPC Client (1551465414) connection to cdh54.vm/172.16.29.132:8020 from hive] org.apache.hadoop.ipc.Client: IPC Client (1551465414) connection to cdh54.vm/172.16.29.132:8020 from hive got value #510346
> 2015-07-21 10:32:20,662 DEBUG [main] org.apache.hadoop.ipc.ProtobufRpcEngine: Call: getListing took 0ms
> 2015-07-21 10:32:20,662 TRACE [main] org.apache.hadoop.ipc.ProtobufRpcEngine: 1: Response <- cdh54.vm/172.16.29.132:8020: getListing {dirList { partialListing { fileType: IS_FILE path: "" length: 863 permission { perm: 4600 } owner: "hive" group: "hive" modification_time: 1437454718130 access_time: 1437454717973 block_replication: 1 blocksize: 134217728 fileId: 33960 childrenNum: 0 storagePolicy: 0 } remainingEntries: 0 }}
> {code}
> The path we are getting out of the listing results is {{/user/hive/warehouse/hbase_test/_temporary/1/_temporary/attempt_1436935612068_0011_m_000000_0/family/97112ac1c09548ae87bd85af072d2e8c}}, but instead of checking the path's parent {{family}}, we loop infinitely over its hashed filename {{97112ac1c09548ae87bd85af072d2e8c}} because it does not match {{family}}.
> The task therefore stays in the infinite loop until the MR framework kills it due to an idle task timeout (and since the subsequent task attempts then fail outright, the job fails).
> While doing a {{getPath().getParent()}} will resolve that, is that infinite loop even necessary? Especially given the fact that we throw exceptions if there are no entries or there is more than one entry.
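
The behaviour described above can be modelled with a small, self-contained sketch. It is not the Hive code: it uses java.nio.file instead of the Hadoop FileSystem API, the class and method names (FamilyDirSearch, descendByName, descendCheckingParent) are hypothetical, and the step cap exists only so the demonstration terminates. It assumes, as the getListing response in the log suggests, that listing a regular file returns that file itself, and that the search has already reached the family directory or the file below it. How srcDir gets there is not modelled.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class FamilyDirSearch {

    // Assumption: mimic an HDFS-style listing, where listing a file yields the file itself.
    static List<Path> list(Path p) throws IOException {
        if (Files.isRegularFile(p)) {
            return Collections.singletonList(p);
        }
        try (Stream<Path> entries = Files.list(p)) {
            return entries.collect(Collectors.toList());
        }
    }

    // Same sanity checks as described in the issue: exactly one entry is expected.
    static Path onlyEntry(Path dir) throws IOException {
        List<Path> files = list(dir);
        if (files.isEmpty()) {
            throw new IOException("No family directories found in " + dir);
        }
        if (files.size() != 1) {
            throw new IOException("Multiple family directories found in " + dir);
        }
        return files.get(0);
    }

    static boolean named(Path p, String name) {
        return p != null && p.getFileName() != null
                && p.getFileName().toString().equals(name);
    }

    // Checks only the entry's own name. Returns null when the cap is hit,
    // which stands in for the uncapped loop spinning forever.
    static Path descendByName(Path start, String family, int maxSteps) throws IOException {
        Path srcDir = start;
        for (int i = 0; i < maxSteps; i++) {
            srcDir = onlyEntry(srcDir);
            if (named(srcDir, family)) {
                return srcDir;
            }
        }
        return null;
    }

    // Variant that also checks the entry's parent, in the spirit of getPath().getParent().
    static Path descendCheckingParent(Path start, String family, int maxSteps) throws IOException {
        Path srcDir = start;
        for (int i = 0; i < maxSteps; i++) {
            Path entry = onlyEntry(srcDir);
            if (named(entry, family)) {
                return entry;
            }
            if (named(entry.getParent(), family)) {
                return entry.getParent();
            }
            srcDir = entry;
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("hfile-demo");
        Path family = root.resolve("attempt").resolve("family");
        Files.createDirectories(family);
        Files.createFile(family.resolve("97112ac1c09548ae87bd85af072d2e8c"));

        System.out.println("name check only:   " + descendByName(family, "family", 50));
        System.out.println("with parent check: " + descendCheckingParent(family, "family", 50));
    }
}
```

In this model the name-only search never succeeds once it has stepped onto the hashed file, because every further listing returns the same file. The parent-aware variant returns the family directory on the first step. Whether a loop is needed at all remains the open question raised in the issue.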



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)