Posted to yarn-issues@hadoop.apache.org by "Suma Shivaprasad (JIRA)" <ji...@apache.org> on 2018/05/01 00:24:00 UTC

[jira] [Comment Edited] (YARN-8079) Support specify files to be downloaded (localized) before containers launched by YARN

    [ https://issues.apache.org/jira/browse/YARN-8079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459328#comment-16459328 ] 

Suma Shivaprasad edited comment on YARN-8079 at 5/1/18 12:23 AM:
-----------------------------------------------------------------

Uploaded a patch, tested with both the STATIC and ARCHIVE types, with the following changes:

 
 # Modified TestAbstractClientProvider to test validation changes
 # Changed localization destination from "localized" to "resources"

 

Service specs used to test the above:

a. with STATIC
{code}
{
        "name": "sleeper-service",
        "version": "1",
        "components": [{
                "name": "sleep",
                "number_of_containers": 2,
                "launch_command": "sleep 90000",
                "resource": {
                        "cpus": 1,
                        "memory": "256"
                },
                "configuration": {
                        "files": [{
                                "type": "STATIC",
                                "dest_file": "sometextfile",
                                "src_file": "hdfs:///tmp/a.json"
                        }]
                }
        }]
}
{code}
 

b. with ARCHIVE
{code}
{
        "name": "sleeper-service",
        "version": "1",
        "components": [{
                "name": "sleep",
                "number_of_containers": 2,
                "launch_command": "sleep 90000",
                "resource": {
                        "cpus": 1,
                        "memory": "256"
                },
                "configuration": {
                        "files": [{
                                "type": "ARCHIVE",
                                "dest_file": "sometarfile.tar.gz",
                                "src_file": "hdfs:///tmp/a.tar.gz"
                        }]
                }
        }]
}
{code}
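
For illustration only, the destination change in item 2 can be pictured with the sketch below. It assumes that STATIC and ARCHIVE entries end up under a "resources" directory relative to the container working directory, and that dest_file is a relative path; both points are assumptions, and the names LocalizedDest and localPath are hypothetical and not taken from the patch.

```java
// Hypothetical sketch, not the code from the YARN-8079 patch.
final class LocalizedDest {
    // Directory name from the comment above ("localized" was renamed to "resources").
    static final String RESOURCES_DIR = "resources";

    /** Returns the assumed container-relative path for a STATIC or ARCHIVE entry. */
    static String localPath(String type, String destFile) {
        if (!type.equals("STATIC") && !type.equals("ARCHIVE")) {
            throw new IllegalArgumentException("unsupported type for this sketch: " + type);
        }
        // Assumption: dest_file must be a non-empty relative path.
        if (destFile == null || destFile.isEmpty() || destFile.startsWith("/")) {
            throw new IllegalArgumentException("dest_file must be a non-empty relative path");
        }
        return RESOURCES_DIR + "/" + destFile;
    }

    public static void main(String[] args) {
        // The two dest_file values from the specs above.
        System.out.println(localPath("STATIC", "sometextfile"));
        System.out.println(localPath("ARCHIVE", "sometarfile.tar.gz"));
    }
}
```

With the two specs above, this would map the entries to resources/sometextfile and resources/sometarfile.tar.gz respectively.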

 


was (Author: suma.shivaprasad):
Uploaded patch after testing with both STATIC/ARCHIVE types with the following changes to the patch

 
 # Modified TestAbstractClientProvider to test validation changes
 # Changed localization destination from "localized" to "resources"

 

Service spec to test the above

{code}
{
        "name": "sleeper-service",
        "version": "1",
        "components": [{
                "name": "sleep",
                "number_of_containers": 2,
                "launch_command": "sleep 90000",
                "resource": {
                        "cpus": 1,
                        "memory": "256"
                },
                "configuration": {
                        "files": [{
                                "type": "STATIC",
                                "dest_file": "sometextfile",
                                "src_file": "hdfs:///tmp/a.json"
                        }]
                }
        }]
}
{code}

 

 

> Support specify files to be downloaded (localized) before containers launched by YARN
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-8079
>                 URL: https://issues.apache.org/jira/browse/YARN-8079
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Wangda Tan
>            Assignee: Suma Shivaprasad
>            Priority: Critical
>         Attachments: YARN-8079.001.patch, YARN-8079.002.patch, YARN-8079.003.patch, YARN-8079.004.patch, YARN-8079.005.patch, YARN-8079.006.patch, YARN-8079.007.patch
>
>
> Currently, {{srcFile}} is not respected. {{ProviderUtils}} doesn't properly read {{srcFile}}; instead, it always constructs {{remoteFile}} from the component instance directory ({{compInstanceDir}}) and the file name of {{destFile}}:
> {code}
> Path remoteFile = new Path(compInstanceDir, fileName);
> {code} 
> To me this is a common use case: services have files that already exist in HDFS and need to be localized when components are launched. For example, to serve a TensorFlow model, we need to localize the model (typically not huge, less than a GB) to local disk; otherwise the launched docker container has to access HDFS.
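
The behaviour requested in the description above could look roughly like the following sketch. It assumes that an explicit srcFile should take precedence and that the existing compInstanceDir/fileName construction remains the fallback when srcFile is absent. RemoteFileResolver and resolve are hypothetical names, and java.net.URI stands in for Hadoop's Path so that the example has no dependencies; this is not the code in the attached patches.

```java
import java.net.URI;

// Hypothetical sketch of "respect srcFile when it is set".
final class RemoteFileResolver {
    /**
     * Picks the remote location to localize from.
     * srcFile wins when present; otherwise fall back to the old behaviour of
     * combining the component instance dir with the file name of destFile.
     */
    static URI resolve(String srcFile, String compInstanceDir, String destFile) {
        if (srcFile != null && !srcFile.isEmpty()) {
            return URI.create(srcFile);
        }
        // Old behaviour: only the last path segment of destFile is used.
        String fileName = destFile.substring(destFile.lastIndexOf('/') + 1);
        String dir = compInstanceDir.endsWith("/") ? compInstanceDir : compInstanceDir + "/";
        return URI.create(dir + fileName);
    }

    public static void main(String[] args) {
        // The directory below is a made-up example value.
        System.out.println(resolve("hdfs:///tmp/a.json", "/apps/sleeper/sleep-0", "sometextfile"));
        System.out.println(resolve(null, "/apps/sleeper/sleep-0", "sometextfile"));
    }
}
```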


