Posted to dev@tez.apache.org by "Gopal V (JIRA)" <ji...@apache.org> on 2017/08/04 22:00:02 UTC

[jira] [Resolved] (TEZ-3814) Inserts into a bucketed table fail randomly with Hive on Tez

     [ https://issues.apache.org/jira/browse/TEZ-3814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gopal V resolved TEZ-3814.
--------------------------
    Resolution: Not A Bug

> Inserts into a bucketed table fail randomly with Hive on Tez
> ------------------------------------------------------------
>
>                 Key: TEZ-3814
>                 URL: https://issues.apache.org/jira/browse/TEZ-3814
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Anant Mittal
>              Labels: Bucketing, Hive, Tez
>
> The map phase of inserts into a bucketed table fails randomly with the error "Vertex <vertex_id> [Map 1] failed as task <task_id> failed after vertex succeeded.]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:0".
> The task fails because every one of its attempts fails with "<attempt_id> being failed for too many output errors. failureFraction=0.2, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0"
> This happens more often if the table is ACID-enabled and a delete operation is performed before the inserts.
> I have tried the following:
> Changed tez.am.launch.cmd-opts, tez.task.launch.cmd-opts and hive.tez.java.opts to use parallel GC.
> tez.runtime.shuffle.max.allowed.failed.fetch.fraction = 0.95
> tez.runtime.shuffle.failed.check.since-last.completion=false
> tez.runtime.shuffle.fetch.buffer.percent = 0.1
> tez.runtime.shuffle.memory.limit.percent = 0.25
> tez.runtime.shuffle.ssl.enable=false
> Deleted ".../usercache/<user>/filecache" and ".../usercache/<user>/appcache"
> Please advise on a possible solution, and on whether anyone else has been able to run a large number of inserts on a bucketed table via Tez. I am using the HDP 2.6 distribution.
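
As a reading aid for the quoted error message, the sketch below parses its key=value fields and lists which observed value is above its limit. The pairing of each observed value with a limit, and the rule that a value counts as exceeded when it is strictly greater than the limit, are assumptions inferred from the field names in the message; this is not Tez's actual decision logic. The function names are hypothetical.

```python
import re

# Field text copied verbatim from the error message quoted in the report.
MESSAGE = (
    "failureFraction=0.2, MAX_ALLOWED_OUTPUT_FAILURES_FRACTION=0.1, "
    "uniquefailedOutputReports=1, MAX_ALLOWED_OUTPUT_FAILURES=10, "
    "MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC=300, readErrorTimespan=0"
)

# Assumed pairing of observed value -> limit, guessed from the field names.
PAIRS = [
    ("failureFraction", "MAX_ALLOWED_OUTPUT_FAILURES_FRACTION"),
    ("uniquefailedOutputReports", "MAX_ALLOWED_OUTPUT_FAILURES"),
    ("readErrorTimespan", "MAX_ALLOWED_TIME_FOR_TASK_READ_ERROR_SEC"),
]


def parse_fields(message):
    """Extract every name=number pair from the message as floats."""
    return {name: float(value)
            for name, value in re.findall(r"(\w+)=([0-9.]+)", message)}


def exceeded_limits(fields):
    """Return the observed fields whose value is strictly above its limit.

    The strict comparison is an assumption, not taken from Tez.
    """
    return [observed for observed, limit in PAIRS
            if fields[observed] > fields[limit]]


if __name__ == "__main__":
    fields = parse_fields(MESSAGE)
    for name in exceeded_limits(fields):
        print(name, "is above its limit")
```

Under these assumptions, only failureFraction (0.2 against a limit of 0.1) is above its limit in the reported message; the report count and the read-error timespan are both below theirs.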



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)