Posted to dev@crail.apache.org by ZhaoJP <zh...@gmail.com> on 2018/11/19 02:45:11 UTC

Status of GPU memory support in Crail?

Hi Crail,
This is a very interesting project. I have a few questions on GPU
memory support and heterogeneity.
1) How does Crail currently handle or exploit GPU memory as one of its
storage tiers? I saw a diagram showing GPU/GPUDirect, but when exploring
the source code of Crail/DiSNI/jNVMf etc., I didn't find any code
related to GPU/GPUDirect/CUDA. What is the status of GPU memory as a
tier, and is it on the roadmap?
2) A more specific question: does Crail support P2P transfers from NVMe
to GPU (or vice versa)?
3) Another introduction page
(https://crail.apache.org/overview/index.html#fs) gives the following
high-level description:
"For instance, an application may use the Crail GPU tier to store data. In
that case, sorting can be pushed to the GPU, rather than fetching the data
into main memory and sorting it on the CPU. In other cases, the application
may know the data types in advance and use the information to simplify
sorting (e.g. use Radix sort instead TimSort). "
It looks to me like this needs a GPU implementation of the sorting
algorithm (e.g., via CUDA) to process the data on the GPU where it
resides. Who provides such a GPU sorter? I didn't see it in the current
Crail code; does it come from Spark or from third-party libraries? The
shuffle settings I found are:

spark.crail.shuffle.serializer
spark.crail.shuffle.sorter
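For context, my understanding is that these two properties plug custom
serializer/sorter implementations into the Crail shuffle engine via
spark-defaults.conf, roughly like the snippet below. The class names are
my guess from the crail-spark-io README, so please correct me if I
misread them:

    spark.crail.shuffle.serializer  org.apache.spark.shuffle.crail.CrailSparkShuffleSerializer
    spark.crail.shuffle.sorter      org.apache.spark.shuffle.crail.CrailSparkShuffleSorter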


Thanks a lot
------------------------------------------
zhaojp@gmail.com

Re: Status of GPU memory support in Crail?

Posted by Patrick Stuedi <ps...@gmail.com>.
Hi,

[https://issues.apache.org/jira/browse/CRAIL-86]

GPU integration with Crail is an important item on our TODO list and
has been on the roadmap for a while. There are two (maybe more) ways
GPUs could be integrated into Crail:

a) as a GPU tier: here, the memories of individual GPUs in the cluster
form a distributed storage tier in Crail that can be used to store
data, e.g., by explicitly selecting the GPU tier as a preferred tier
when creating a file/object/etc. We haven't started implementing this
option yet, though.
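
To make (a) a bit more concrete: with the current Java API an
application already expresses a tier preference through a storage class
when creating a file, and a GPU tier would simply become another storage
class to select. Below is a rough, untested sketch; the create()
signature is quoted from memory and storage class 2 is a made-up value
standing in for a hypothetical GPU tier configured in crail-site.conf:

    import org.apache.crail.*;
    import org.apache.crail.conf.CrailConfiguration;

    public class GpuTierSketch {
        public static void main(String[] args) throws Exception {
            CrailConfiguration conf = new CrailConfiguration();
            CrailStore store = CrailStore.newInstance(conf);
            // Hypothetical: storage class 2 configured as the GPU tier;
            // today it maps to whatever tier is registered under that class.
            CrailFile file = store.create("/ml/training-batch-0",
                    CrailNodeType.DATAFILE,
                    CrailStorageClass.get(2),      // preferred-tier hint
                    CrailLocationClass.DEFAULT,
                    true)                          // enumerable
                    .get().asFile();
            System.out.println("created " + file.getPath());
        }
    }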

b) as a client: in this option, we allow applications to access data
on any storage tier (NVMf, RDMA) and read it directly into the memory
of the GPU at the client. This is particularly interesting for
distributed machine learning training. Such a client-driven GPU
integration we are currently working on a implementation as part of
the native Crail client implementation (C++) which we plan to merge
into Crail master sometimes next year.
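
To illustrate what (b) changes at the API level: today a client reads
file data into host memory and then copies it to the GPU itself, as in
the untested sketch below (API calls from memory). The client-driven
GPU integration essentially lets the destination buffer be GPU memory,
so the staging copy on the host goes away:

    import org.apache.crail.*;
    import org.apache.crail.conf.CrailConfiguration;

    public class ClientReadSketch {
        public static void main(String[] args) throws Exception {
            CrailConfiguration conf = new CrailConfiguration();
            CrailStore store = CrailStore.newInstance(conf);
            CrailFile file = store.lookup("/ml/training-batch-0")
                    .get().asFile();
            CrailBufferedInputStream in =
                    file.getBufferedInputStream(file.getCapacity());
            byte[] chunk = new byte[1 << 20];
            int n;
            while ((n = in.read(chunk)) > 0) {
                // process chunk[0..n), e.g. cudaMemcpy it to the device;
                // with the GPU client this staging step would disappear
            }
            in.close();
        }
    }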

Both options (a) and (b) may use GPUDirect RDMA to transfer data from
the network to the GPU in a P2P manner.

As for your question about sorting and GPUs: you're right, such a
sorter won't be part of Apache Crail, which only provides raw storage
services. A GPU sorter could, however, be provided as a Spark library
to accelerate Spark shuffling in combination with a Crail GPU storage
tier, similar to how Crail-Spark-IO provides optimized sorters for
Spark shuffling (https://github.com/zrlio/crail-spark-io).
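
To illustrate the kind of specialization the overview page means by
"Radix sort instead of TimSort": when the shuffle key type is known in
advance (say, 32-bit integers), a sorter library can replace the
general comparison sort with a key-specific one. The sketch below is
plain Java and deliberately not tied to the crail-spark-io plugin
interfaces; a GPU sorter would apply the same idea with a CUDA kernel
over data already resident in the GPU tier:

    import java.util.Arrays;

    public class RadixSortSketch {
        // LSD radix sort on int keys, 8 bits per pass; the sign bit is
        // flipped so negative keys order correctly as unsigned values.
        static void radixSort(int[] keys) {
            int[] src = keys.clone();
            int[] dst = new int[keys.length];
            for (int i = 0; i < src.length; i++) src[i] ^= 0x80000000;
            for (int shift = 0; shift < 32; shift += 8) {
                int[] count = new int[257];
                for (int k : src) count[((k >>> shift) & 0xFF) + 1]++;
                for (int i = 0; i < 256; i++) count[i + 1] += count[i];
                for (int k : src) dst[count[(k >>> shift) & 0xFF]++] = k;
                int[] tmp = src; src = dst; dst = tmp;
            }
            for (int i = 0; i < keys.length; i++) keys[i] = src[i] ^ 0x80000000;
        }

        public static void main(String[] args) {
            int[] keys = {42, -7, 1000000, 0, -2000000000, 13};
            radixSort(keys);
            System.out.println(Arrays.toString(keys)); // ascending order
        }
    }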

-Patrick
