Posted to issues@ambari.apache.org by "Akhil S Naik (JIRA)" <ji...@apache.org> on 2019/05/24 10:31:00 UTC

[jira] [Created] (AMBARI-25285) Ambari always copies and overwrites mapreduce.tar.gz to HDFS when WebHDFS is not enabled while restarting HiveServer

Akhil S Naik created AMBARI-25285:
-------------------------------------

             Summary: Ambari always copies and overwrites mapreduce.tar.gz to HDFS when WebHDFS is not enabled while restarting HiveServer
                 Key: AMBARI-25285
                 URL: https://issues.apache.org/jira/browse/AMBARI-25285
             Project: Ambari
          Issue Type: Bug
          Components: ambari-agent
    Affects Versions: 2.7.3
            Reporter: Akhil S Naik
            Assignee: Akhil S Naik


Problem Statement : 

When HiveServer2 is restarted, the startup Python script tries to copy /usr/hdp/<version>/hadoop/mapreduce.tar.gz to /hdp/apps/<version>/mapreduce/mapreduce.tar.gz in HDFS.

MapReduce jobs can fail with an error if HiveServer2 is restarted and YARN applications move from the ACCEPTED state to RUNNING at the exact moment the mapreduce.tar.gz file is being copied.

When WebHDFS is enabled, this problem does not occur, because Ambari skips the copy and logs the line below.


{code:java}
2019-05-23 10:11:18,371 - DFS file /hdp/apps/2.6.5.0-292/mapreduce/mapreduce.tar.gz is identical to /usr/hdp/2.6.5.0-292/hadoop/mapreduce.tar.gz, skipping the copying
{code}

When WebHDFS is disabled in the cluster, the above line is not printed when starting HiveServer2. Instead, Ambari overwrites mapreduce.tar.gz without checking whether an identical file is already present.

Analysis : 

The issue appears to be in this part of the code : https://github.com/apache/ambari/blob/4eee0f56d2fbfdfb0caace955339bc0c46a85a3c/contrib/fast-hdfs-resource/src/main/java/org/apache/ambari/fast_hdfs_resource/Runner.java#L131

https://github.com/apache/ambari/blob/4eee0f56d2fbfdfb0caace955339bc0c46a85a3c/contrib/fast-hdfs-resource/src/main/java/org/apache/ambari/fast_hdfs_resource/Resource.java#L236

We are just creating the file and overwriting it if it exists.
We should do a basic check of whether the file already exists before this copy operation, and skip the copy if the file is the same.

This will reduce the HiveServer2 start time and also avoid abnormal failures of MapReduce jobs.
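
The check proposed above could look roughly like the sketch below. It is only an illustration under stated assumptions: it uses the local filesystem through java.nio.file as a stand-in for HDFS, and the class name SkipIdenticalCopy and the method copyIfDifferent are hypothetical names that do not exist in Ambari or in fast-hdfs-resource. The idea is to compare size first (cheap), then a content digest, and only overwrite when the two files differ.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Ambari or fast-hdfs-resource.
final class SkipIdenticalCopy {

    // Streams the file through SHA-256 so a large archive is never fully in memory.
    static byte[] sha256(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on every Java platform.
            throw new IllegalStateException(e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        return digest.digest();
    }

    // Returns true if the copy was performed, false if it was skipped
    // because the target already holds identical content.
    static boolean copyIfDifferent(Path source, Path target) throws IOException {
        if (Files.isRegularFile(target)
                && Files.size(source) == Files.size(target)
                && Arrays.equals(sha256(source), sha256(target))) {
            return false; // identical file, leave the target untouched
        }
        Path parent = target.getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        Files.copy(source, target, StandardCopyOption.REPLACE_EXISTING);
        return true;
    }
}
```

With a check of this kind, a restart that finds an unchanged mapreduce.tar.gz would not rewrite the target, so jobs localizing the archive at that moment would not see a file that is mid-overwrite. A real fix would need the equivalent comparison against the file in HDFS rather than a local path.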





