Posted to yarn-dev@hadoop.apache.org by sabiya kazi <sa...@gmail.com> on 2017/03/07 06:44:25 UTC

Queries on YARN

Hello Folks,
I am new to Hadoop and have a few queries about YARN. I have gone through
the documentation and the blog series by Hortonworks.

 1. Can we define custom logical resources in YARN other than memory and
CPU cores? Is this flexibility available?
 2. What is the role of HDFS in YARN? Can YARN work without HDFS?
 3. If HDFS is still used for replication, which component is responsible
for taking backups?
 4. HDFS is needed for large data processing applications where the data to
be processed is persisted on HDFS. Can I use YARN without HDFS for solving
computational problems?
 5. Is a container's state or the running node's image persisted to provide
failover for running tasks?

Can somebody please answer these questions?


Regards,
-Sabiya

Re: Queries on YARN

Posted by Ravi Prakash <ra...@gmail.com>.
Hi Sabiya,

1. Unfortunately no.
2. Technically yes, although I'm not sure if people have tried it and
succeeded. Which other filesystem were you thinking of using?
3. None. Admins can choose to take backups. Recently HDFS got the snapshot
feature.
4. Hopefully yes.
5. No. Tasks are restarted on other nodes when one node fails. So the tasks
should ideally not have external effects that can't be overwritten.
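
To illustrate point 5: because a failed task is simply re-run elsewhere, its output should be safe to overwrite. The following is only a minimal sketch of that idea on a local filesystem, not YARN or HDFS API code; the function names, the `part-NNNNN` file naming, and the output directory are all hypothetical choices made for the example.

```python
import os
import tempfile


def write_output_atomically(final_path: str, data: str) -> None:
    """Write data so that a retried task simply overwrites the earlier result."""
    directory = os.path.dirname(os.path.abspath(final_path))
    # Create the temp file in the target directory so the rename
    # stays on a single filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
        # os.replace overwrites an existing destination file.
        os.replace(tmp_path, final_path)
    except BaseException:
        # Clean up the partial temp file if anything went wrong.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def run_task(task_id: int, out_dir: str) -> str:
    """Hypothetical task: the output name depends only on task_id,
    never on which attempt or which node produced it."""
    final_path = os.path.join(out_dir, f"part-{task_id:05d}")
    write_output_atomically(final_path, f"result of task {task_id}\n")
    return final_path
```

Running `run_task` twice with the same `task_id` leaves a single output file with the same contents, which is the property a restarted task needs.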

HTH
Ravi


Re: Queries on YARN

Posted by Vinod Kumar Vavilapalli <vi...@apache.org>.
Hi Sabiya,

> 1. Can we define our custom logical resources in YARN other than Memory &
> cpu core?  Is this flexibility there?

There is an ongoing effort to satisfy this very requirement. You can follow it at https://issues.apache.org/jira/browse/YARN-3926.

> 2. What is a role of HDFS in yarn, Can yarn work without HDFS?

YARN depends on a FileSystem (of which HDFS is one implementation) for two things:
 (a) to locate the application's artifacts (jars, configurations, binaries, etc.) so that individual nodes can download them, and
 (b) to provide a location where application logs can be uploaded once an application finishes.

HTH
+Vinod