Posted to issues@tez.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2019/05/02 20:31:00 UTC

[jira] [Commented] (TEZ-3420) Parallel queries to HS2/Tez not thread safe (local mode)

    [ https://issues.apache.org/jira/browse/TEZ-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831937#comment-16831937 ] 

Todd Lipcon commented on TEZ-3420:
----------------------------------

Tracked this down to a couple of things:
(1) Tez side: LocalClient uses a timestamp to generate appIds, so if multiple clients start at the same time, they'll conflict and cause problems.
(2) Hive side: The current implementation of IOContext uses global statics in the case of Tez, so tasks overwrite each other's IOContexts. Switching that to be keyed on an attempt ID (similar to what's done with LLAP) ought to fix it, but that's a Hive-side change.
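
As a rough illustration of both points (a sketch only, not the actual Tez or Hive code; LocalModeSketch, newAppId, IoContextSketch, contextFor and release are all hypothetical names made up for this example): pairing the timestamp with a process-wide counter keeps IDs distinct even when two clients start in the same millisecond, and a map keyed on the attempt ID gives each task its own context instead of one shared static.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: none of these names are real Tez or Hive classes.
final class LocalModeSketch {

    // (1) A timestamp alone collides when two clients start at the same
    // time. Adding a process-wide sequence number keeps the IDs distinct.
    private static final AtomicInteger APP_SEQ = new AtomicInteger();

    static String newAppId(long clusterTimestamp) {
        return String.format("application_%d_%04d",
                clusterTimestamp, APP_SEQ.incrementAndGet());
    }

    // (2) Stand-in for a per-task I/O context. The single field is only
    // here to show that state is no longer shared between attempts.
    static final class IoContextSketch {
        volatile String inputPath;
    }

    // One context per attempt ID instead of one global static instance.
    private static final Map<String, IoContextSketch> CONTEXTS =
            new ConcurrentHashMap<>();

    static IoContextSketch contextFor(String attemptId) {
        // computeIfAbsent is atomic, so concurrent callers asking for the
        // same attempt ID always get the same instance.
        return CONTEXTS.computeIfAbsent(attemptId, id -> new IoContextSketch());
    }

    // Drop the entry when the attempt finishes so the map does not grow
    // without bound in a long-lived process such as HS2.
    static void release(String attemptId) {
        CONTEXTS.remove(attemptId);
    }
}
```

With something along these lines, two tasks running in the same JVM would each call contextFor with their own attempt ID and never see each other's input path.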

> Parallel queries to HS2/Tez not thread safe (local mode)
> --------------------------------------------------------
>
>                 Key: TEZ-3420
>                 URL: https://issues.apache.org/jira/browse/TEZ-3420
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.8.4
>         Environment: HiveServer2 1.2.1 Local mode + Tez
>            Reporter: Uday Chitragar
>            Assignee: Todd Lipcon
>            Priority: Major
>         Attachments: hive.log.submit.gz
>
>
> When running parallel queries (simultaneous connections by two beeline clients to HS2), I get the following exception (full debug log attached). Interestingly, running the queries one after the other completes without any problem. The partition location and actual files seem to get mixed up across the DAGs.
>  
> The setup is Hive (1.2.1) and Tez (0.8.4) running in local mode.
> {noformat} 
> 2016-08-25 15:45:41,333 DEBUG [TezTaskEventRouter{attempt_1472136335089_0001_1_01_000000_0}]: impl.ShuffleInputEventHandlerImpl (ShuffleInputEventHandlerImpl.java:processDataMovementEvent(127)) - DME srcIdx: 0, targetIndex: 9, attemptNum: 0, payload: [hasEmptyPartitions: true, host: , port: 0, pathComponent: , runDuration: 0]
> 2016-08-25 15:45:41,557 ERROR [TezChild]: tez.MapRecordSource (MapRecordSource.java:processRow(90)) - java.lang.IllegalStateException: Invalid input path file:/acorn/QC/OraExtract/20160131/Devices/Devices_extract_20160229T080613_3
>         at org.apache.hadoop.hive.ql.exec.MapOperator.getNominalPath(MapOperator.java:415)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.cleanUpInputFileChangedOp(MapOperator.java:457)
>         at org.apache.hadoop.hive.ql.exec.Operator.cleanUpInputFileChanged(Operator.java:1069)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:501)
>  
>  
>  
> 2016-08-25 15:45:41,817 INFO  [TezChild]: io.HiveContextAwareRecordReader (HiveContextAwareRecordReader.java:doNext(326)) - Cannot get partition description from file:/acorn/QC/reportlib/VM_ValEdit.24656because cannot find dir = file:/acorn/QC/reportlib/VM_ValEdit.24656 in pathToPartitionInfo: [file:/acorn/QC/OraExtract/20160131/Devices]
> {noformat}
> Perhaps clashing directories for intermediate data might be causing an issue?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)