Posted to yarn-issues@hadoop.apache.org by "Zhiyuan Yang (JIRA)" <ji...@apache.org> on 2016/08/01 22:37:21 UTC

[jira] [Updated] (YARN-5396) YARN large file broadcast service

     [ https://issues.apache.org/jira/browse/YARN-5396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhiyuan Yang updated YARN-5396:
-------------------------------
    Attachment: YARN-broadcast-prototype.patch

Attaching the hacky prototype from my internship work last year. I've made it work on the most recent branch-2 revision so that people can try it out (although I don't recommend doing so until the doc is uploaded). It contains the following:
1. A BitTorrent-based broadcast service, run as an aux service.
2. Modified resource localization that makes use of the broadcast service and computes the md5 of the localized file.
3. An example YARN app that simply localizes a resource via the broadcast service.
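
To illustrate the md5 step in item 2, here is a minimal sketch of how the checksum of a localized file could be computed and compared. It is not taken from the attached patch: the class name LocalizedFileChecksum and its methods md5Hex and matches are hypothetical, and only standard JDK calls are used.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration only; not part of the prototype patch.
final class LocalizedFileChecksum {

    // Streams the file through MD5 in 64 KB chunks so that large files
    // never have to fit in memory, then returns the digest as lowercase hex.
    static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[64 * 1024];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                md5.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md5.digest()) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // Returns true when the file's digest equals the expected hex string
    // (compared case-insensitively).
    static boolean matches(Path file, String expectedMd5Hex)
            throws IOException, NoSuchAlgorithmException {
        return md5Hex(file).equalsIgnoreCase(expectedMd5Hex);
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: LocalizedFileChecksum <file>");
            return;
        }
        System.out.println(md5Hex(Paths.get(args[0])));
    }
}
```

A caller could run matches(path, expected) after a download finishes and discard the file on a mismatch. Whether the prototype checks the digest at that point is an assumption here, not something stated above.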

I'll attach some documentation about the design, implementation, and usage later.

> YARN large file broadcast service
> ---------------------------------
>
>                 Key: YARN-5396
>                 URL: https://issues.apache.org/jira/browse/YARN-5396
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Zhiyuan Yang
>            Assignee: Zhiyuan Yang
>         Attachments: YARN-broadcast-prototype.patch
>
>
> In Hadoop and related software, there is demand for broadcasting large files. For example, a YARN application may localize large jar files on each node; Hive may distribute large tables in fragment-replicate joins; Docker integration may broadcast large container images. The current local-resource-based solution is to put the files on HDFS and let each node download them from HDFS, which is inefficient and does not scale. So we want to build a better file transfer service in YARN so that all applications can use it to broadcast large files efficiently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
