Posted to dev@arrow.apache.org by Edmon Begoli <eb...@gmail.com> on 2016/03/02 19:21:33 UTC

Intel CPU architecture

Hey folks,

How could I get more details on how Arrow takes advantage of Intel CPUs
for computation?

At JICS, we run very large experimental Intel HPC systems, and I would like
to learn how we could run some interesting Arrow experiments on Intel CPUs.

Thank you,
Edmon

Re: Intel CPU architecture

Posted by Edmon Begoli <eb...@gmail.com>.
Yes (about JICS/NICS)

There are many platforms here.

Beacon is probably a better fit than Titan because it is a large-memory
machine and can be allocated more easily:

https://www.nics.tennessee.edu/beacon

I also have authority to commit thousands of hours of Beacon time to the
Arrow project, if needed.

Titan is a little bit harder to access. It will soon be replaced by Summit
anyway.

On Wednesday, March 2, 2016, Venkat Krishnamurthy <ni...@gmail.com>
wrote:

> Is JICS the Joint Institute for Comp Sciences at ORNL/UT? If so, is one of
> the target platforms Titan@ORNL?

Re: Intel CPU architecture

Posted by Venkat Krishnamurthy <ni...@gmail.com>.
Is JICS the Joint Institute for Comp Sciences at ORNL/UT? If so, is one of
the target platforms Titan@ORNL?

On Wed, Mar 2, 2016 at 2:56 PM, Edmon Begoli <eb...@gmail.com> wrote:

> Would you guys be interested in perhaps having a Hangout with my team from
> JICS/NICS?
>
> We have some major experts and research thrusts in this area (code
> optimizations for Intel chipsets, MKL and other kernels, memory/IO
> optimizations, etc)
>
> We are a research shop. People just get excited over things like this.

Re: Intel CPU architecture

Posted by Edmon Begoli <eb...@gmail.com>.
Would you guys be interested in perhaps having a Hangout with my team from
JICS/NICS?

We have some major experts and research thrusts in this area (code
optimizations for Intel chipsets, MKL and other kernels, memory/IO
optimizations, etc.).

We are a research shop. People just get excited over things like this.


On Wednesday, March 2, 2016, Wes McKinney <we...@cloudera.com> wrote:

> hi Edmon,
>
> Since Arrow arrays are arranged with like-data in contiguous memory
> regions (for example, in an array of strings, the UTF8 bytes are all
> laid out in contiguous memory -- see
> https://github.com/apache/arrow/blob/master/format/Layout.md), it is
> cache-friendly for scan operations and amenable to SIMD computations
> (for example: SIMD-accelerated hash functions). This is especially
> important for nested data, as all the "leaf nodes" in a nested
> structure generally contain contiguous memory.
>
> We have not started doing this yet, but it would be useful to begin
> assembling kernels that use CPU intrinsics (and SSE/AVX) in the Arrow
> codebase, and to make them easily accessible. Having a standard
> benchmark suite and other performance experimentation tools available
> for users to run on their hardware would also be great.
>
> best,
> Wes

Re: Intel CPU architecture

Posted by Wes McKinney <we...@cloudera.com>.
hi Edmon,

Since Arrow arrays are arranged with like data in contiguous memory
regions (for example, in an array of strings, the UTF-8 bytes are all
laid out in contiguous memory -- see
https://github.com/apache/arrow/blob/master/format/Layout.md), the format
is cache-friendly for scan operations and amenable to SIMD computations
(for example: SIMD-accelerated hash functions). This is especially
important for nested data, as all the "leaf nodes" in a nested
structure are generally stored in contiguous memory.
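
To make the string example concrete, here is a minimal sketch of that kind
of layout in plain Python: one contiguous UTF-8 data buffer plus an offsets
buffer that marks where each value starts and ends. The function names
(build_string_array, get_string, value_lengths) are hypothetical and are not
part of any Arrow API, and the null bitmap is left out, so treat this as an
illustration of the idea rather than the actual implementation.

```python
from array import array


def build_string_array(values):
    """Pack strings into (offsets, data).

    data is one contiguous UTF-8 buffer; offsets has len(values) + 1
    32-bit entries, so value i occupies data[offsets[i]:offsets[i + 1]].
    """
    offsets = array("i", [0])
    data = bytearray()
    for value in values:
        data += value.encode("utf-8")
        offsets.append(len(data))  # end of this value, start of the next
    return offsets, bytes(data)


def get_string(offsets, data, i):
    """Decode value i by slicing the shared data buffer."""
    return data[offsets[i]:offsets[i + 1]].decode("utf-8")


def value_lengths(offsets):
    """Byte length of every value, computed from the offsets alone."""
    return [offsets[i + 1] - offsets[i] for i in range(len(offsets) - 1)]


# Hypothetical sample values; offsets count bytes, not characters.
offsets, data = build_string_array(["joe", "", "zoë"])
```

A scan such as value_lengths never touches the string bytes at all, and a
scan over data walks a single buffer from start to end, which is the access
pattern that caches and SIMD units tend to favor.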

We have not started doing this yet, but it would be useful to begin
assembling kernels that use CPU intrinsics (and SSE/AVX) in the Arrow
codebase, and to make them easily accessible. Having a standard
benchmark suite and other performance experimentation tools available
for users to run on their hardware would also be great.
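
As a rough illustration of what a small, repeatable benchmark could look
like, the sketch below times a scan over one contiguous buffer of 64-bit
integers using only the Python standard library. It does not use SIMD
intrinsics, and the names run_benchmark and sum_contiguous are hypothetical;
a real suite for Arrow would presumably live next to the native kernels.

```python
import timeit
from array import array


def sum_contiguous(values):
    """Scan a contiguous buffer of 64-bit integers and return the sum."""
    return sum(values)


def run_benchmark(func, arg, repeat=3, number=5):
    """Return the best observed time per call, in seconds."""
    timings = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    return min(timings) / number


# Hypothetical workload: 100,000 int64 values in one contiguous buffer.
column = array("q", range(100_000))
best = run_benchmark(sum_contiguous, column)
print(f"sum over {len(column)} int64 values: {best * 1e6:.1f} us per call")
```

Reporting the minimum over several repeats is a common way to reduce noise
from other processes on a shared machine.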

best,
Wes

On Wed, Mar 2, 2016 at 10:21 AM, Edmon Begoli <eb...@gmail.com> wrote:
> Hey folks,
>
> How could I get more details on what and how Arrow uses Intel CPUs for
> whatever computational advantage?
>
> At JICS, we run very large experimental Intel HPC systems, and I would like
> to learn how can we possibly run some interesting Arrow on Intel CPUs
> experiments.
>
> Thank you,
> Edmon