Posted to user@spark.apache.org by "Kinsella, Shane" <Sh...@Aspect.com> on 2015/10/23 17:15:01 UTC

How does Spark coordinate with Tachyon wrt data locality

Hi all,

I am looking into how Spark handles data locality wrt Tachyon. My main concern is how this is coordinated. Will Spark send a task that reads a file loaded from Tachyon to a node that it knows holds that file locally, and how does it know which nodes have what?

Kind regards,
Shane
This email (including any attachments) is proprietary to Aspect Software, Inc. and may contain information that is confidential. If you have received this message in error, please do not read, copy or forward this message. Please notify the sender immediately, delete it from your system and destroy any copies. You may not further disclose or distribute this email or its attachments.

Re: How does Spark coordinate with Tachyon wrt data locality

Posted by Calvin Jia <ji...@gmail.com>.
Hi Shane,

Tachyon provides an API to get the block locations of a file, which Spark
uses when scheduling tasks.
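
As a rough illustration of the idea only (this is a toy model in Python, not
Spark's scheduler or Tachyon's API), the sketch below assumes a hypothetical
table mapping each block to the hosts that hold it, and assigns each task to
a host that stores its block whenever that host has a free slot. The names
BLOCK_LOCATIONS and assign_tasks, and the host names, are made up for this
example.

```python
# Hypothetical block-location table, standing in for what a storage system's
# block-location API might return: block id -> hosts holding that block.
BLOCK_LOCATIONS = {
    "file-block-0": ["node-a", "node-b"],
    "file-block-1": ["node-b"],
    "file-block-2": ["node-c"],
}


def assign_tasks(block_locations, free_slots):
    """Assign one task per block, preferring a host that stores the block.

    block_locations: dict of block id -> list of hosts holding the block.
    free_slots: dict of host -> number of tasks the host can still accept.
    Returns a dict of block id -> (host, is_local).
    """
    slots = dict(free_slots)  # copy so the caller's dict is not modified
    assignment = {}
    for block_id in sorted(block_locations):
        # Hosts that hold the block and still have capacity.
        local = [h for h in block_locations[block_id] if slots.get(h, 0) > 0]
        if local:
            host, is_local = local[0], True
        else:
            # No local host is free: fall back to any host with capacity.
            candidates = sorted(h for h, n in slots.items() if n > 0)
            if not candidates:
                raise RuntimeError("no free slots left for " + block_id)
            host, is_local = candidates[0], False
        slots[host] -= 1
        assignment[block_id] = (host, is_local)
    return assignment


if __name__ == "__main__":
    result = assign_tasks(
        BLOCK_LOCATIONS, {"node-a": 1, "node-b": 1, "node-d": 2}
    )
    for block_id, (host, is_local) in result.items():
        print(block_id, "->", host, "(local)" if is_local else "(remote)")
```

In this toy run the third block has no free local host, so its task runs
remotely. Spark's real scheduler is more involved; for example, it can wait
for a configurable time (spark.locality.wait) before giving up on a local
slot.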

Hope this helps,
Calvin
