Posted to dev@arrow.apache.org by Paul Taylor <pt...@gmail.com> on 2019/07/24 20:07:03 UTC

Building on Arrow CUDA

I'm looking at options to replace the custom Arrow logic in cuDF with 
Arrow library calls. What's the recommended way to declare a dependency 
on pyarrow/arrowcpp with CUDA support?

I see in the docs it says to build from source, but that's only an 
option for an (advanced) end-user. And building/vendoring 
libarrow_cuda.so isn't a great option for a non-Arrow library, because 
someone who does source build Arrow-with-cuda will conflict with the 
version we ship.

Right now we're considering statically linking libarrow_cuda into 
libcudf.so and vendoring Arrow's cuda cython alongside ours, but this 
increases compile times/library size.

Is there a package management solution (like pip/conda install 
pyarrow[cuda]) that I'm missing? If not, should there be?

Best,

Paul


Re: Building on Arrow CUDA

Posted by "Uwe L. Korn" <uw...@xhochy.com>.
Hello Paul,

You might want to look into https://github.com/conda-forge/conda-forge.github.io/issues/687 where CUDA support on conda-forge is discussed. I'm not up to date on this anymore, but reading the whole issue should give you the current level of support. Once this is solved, adding CUDA support to the Arrow packages on conda-forge should be really simple (but this issue is the major hurdle).

Cheers
Uwe

On Thu, Jul 25, 2019, at 3:54 PM, Wes McKinney wrote:
> hi Paul,
> 
> On Wed, Jul 24, 2019 at 3:07 PM Paul Taylor <pt...@gmail.com> wrote:
> >
> > I'm looking at options to replace the custom Arrow logic in cuDF with
> > Arrow library calls. What's the recommended way to declare a dependency
> > on pyarrow/arrowcpp with CUDA support?
> >
> 
> Well, for conda or wheel packages, we are not shipping with the CUDA
> extensions enabled yet. So if you want to depend on one of those, you
> will have to change that. My understanding is that it's possible to
> build CUDA-enabled packages in conda-forge -- that would probably be
> your best bet. Does anyone know examples of such packages that are
> CUDA-enabled?
> 
> > I see in the docs it says to build from source, but that's only an
> > option for an (advanced) end-user. And building/vendoring
> > libarrow_cuda.so isn't a great option for a non-Arrow library, because
> > someone who does source build Arrow-with-cuda will conflict with the
> > version we ship.
> >
> > Right now we're considering statically linking libarrow_cuda into
> > libcudf.so and vendoring Arrow's cuda cython alongside ours, but this
> > increases compile times/library size.
> >
> > Is there a package management solution (like pip/conda install
> > pyarrow[cuda]) that I'm missing? If not, should there be?
> >
> 
> You can submit pull requests to
> 
> * https://github.com/conda-forge/arrow-cpp-feedstock
> * https://github.com/conda-forge/pyarrow-feedstock
> 
> conda-forge itself can provide guidance at
> https://gitter.im/conda-forge/conda-forge.github.io
> 
> > Best,
> >
> > Paul
> >
>

Re: Building on Arrow CUDA

Posted by Wes McKinney <we...@gmail.com>.
hi Paul,

On Wed, Jul 24, 2019 at 3:07 PM Paul Taylor <pt...@gmail.com> wrote:
>
> I'm looking at options to replace the custom Arrow logic in cuDF with
> Arrow library calls. What's the recommended way to declare a dependency
> on pyarrow/arrowcpp with CUDA support?
>

Well, for conda or wheel packages, we are not shipping with the CUDA
extensions enabled yet. So if you want to depend on one of those, you
will have to change that. My understanding is that it's possible to
build CUDA-enabled packages in conda-forge -- that would probably be
your best bet. Does anyone know examples of such packages that are
CUDA-enabled?
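
Until CUDA-enabled packages exist, a downstream library can at least check
at import time whether the installed pyarrow was built with the CUDA
extensions. A minimal sketch, assuming the extension is exposed as the
pyarrow.cuda module (as it is in a source build with CUDA enabled); the
helper name has_arrow_cuda is hypothetical and not part of pyarrow:

```python
import importlib


def has_arrow_cuda(module_name="pyarrow.cuda"):
    """Return True if module_name can be imported, False otherwise.

    Assumption: a CUDA-enabled pyarrow build exposes `pyarrow.cuda`.
    A missing pyarrow, or a pyarrow built without CUDA, both surface
    as ImportError here.
    """
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    if has_arrow_cuda():
        print("pyarrow with CUDA support is available")
    else:
        print("pyarrow.cuda not found; a vendored or custom code path is needed")
```

This only detects the extension; it does not address the version conflict
between a vendored libarrow_cuda.so and a user's own source build.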

> I see in the docs it says to build from source, but that's only an
> option for an (advanced) end-user. And building/vendoring
> libarrow_cuda.so isn't a great option for a non-Arrow library, because
> someone who does source build Arrow-with-cuda will conflict with the
> version we ship.
>
> Right now we're considering statically linking libarrow_cuda into
> libcudf.so and vendoring Arrow's cuda cython alongside ours, but this
> increases compile times/library size.
>
> Is there a package management solution (like pip/conda install
> pyarrow[cuda]) that I'm missing? If not, should there be?
>

You can submit pull requests to

* https://github.com/conda-forge/arrow-cpp-feedstock
* https://github.com/conda-forge/pyarrow-feedstock

conda-forge itself can provide guidance at
https://gitter.im/conda-forge/conda-forge.github.io

> Best,
>
> Paul
>