Posted to dev@mnemonic.apache.org by Gary <ga...@apache.org> on 2016/08/17 18:02:17 UTC

Discussion about Columnar layout of data

Hi,

We have essentially completed the durable native computing infrastructure since the last release, so this might be a good time to think about how to introduce a columnar layout of data into Apache Mnemonic. The reasons are:

1) to take advantage of SIMD

2) to optimize the use of CPU caches

3) to compact tensor data

I have come up with the following approaches for your consideration.

a) to build a set of durable columnar collections for data (using MemBuffer or MemChunk as the underlying storage)

b) to let the durable native computing service lay out columnar data as needed (it is possible to gather scattered data using AVX instructions without moving the data)

Please offer your advice. Thanks.

Cheers
+Gary
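
As an illustration only, the two approaches above could look roughly like the following Java sketch. A plain java.nio.ByteBuffer stands in for MemBuffer/MemChunk, and every name here (ColumnarSketch, DoubleColumn, rowsOf, sumStrided) as well as the record layout is hypothetical and not part of the Mnemonic API. The strided loop is a scalar stand-in for what a hardware gather (for example the AVX2 gather instructions) would do from native code.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ColumnarSketch {

    // Approach a): one contiguous buffer per field. The ByteBuffer is an
    // assumption standing in for durable storage such as MemBuffer/MemChunk.
    static final class DoubleColumn {
        final ByteBuffer buf;
        final int length;

        DoubleColumn(int length) {
            this.length = length;
            this.buf = ByteBuffer.allocateDirect(length * Double.BYTES)
                                 .order(ByteOrder.nativeOrder());
        }

        void set(int i, double v) { buf.putDouble(i * Double.BYTES, v); }

        double get(int i) { return buf.getDouble(i * Double.BYTES); }

        // A scan over the column touches only this field's bytes, which is
        // what makes the layout friendly to CPU caches and SIMD.
        double sum() {
            double s = 0.0;
            for (int i = 0; i < length; i++) {
                s += get(i);
            }
            return s;
        }
    }

    // Approach b): records stay in row layout and one field is read at a
    // fixed stride. Hypothetical record: a long id followed by a double value.
    static final int RECORD_BYTES = Long.BYTES + Double.BYTES;
    static final int VALUE_OFFSET = Long.BYTES;

    // Packs (id, value) pairs into a row-oriented buffer.
    static ByteBuffer rowsOf(long[] ids, double[] values) {
        ByteBuffer rows = ByteBuffer.allocateDirect(ids.length * RECORD_BYTES)
                                    .order(ByteOrder.nativeOrder());
        for (int i = 0; i < ids.length; i++) {
            rows.putLong(i * RECORD_BYTES, ids[i]);
            rows.putDouble(i * RECORD_BYTES + VALUE_OFFSET, values[i]);
        }
        return rows;
    }

    // Sums the value field in place; the rows themselves are not rearranged.
    static double sumStrided(ByteBuffer rows, int count) {
        double s = 0.0;
        for (int i = 0; i < count; i++) {
            s += rows.getDouble(i * RECORD_BYTES + VALUE_OFFSET);
        }
        return s;
    }

    public static void main(String[] args) {
        long[] ids = {1L, 2L, 3L, 4L};
        double[] values = {1.5, 2.5, 3.0, 4.0};

        DoubleColumn column = new DoubleColumn(values.length);
        for (int i = 0; i < values.length; i++) {
            column.set(i, values[i]);
        }

        ByteBuffer rows = rowsOf(ids, values);
        System.out.println("columnar sum: " + column.sum());
        System.out.println("strided sum:  " + sumStrided(rows, values.length));
    }
}
```

Both paths compute the same result; the difference is whether the data is stored column-wise up front (a) or the field is picked out of the existing layout at compute time (b).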

Re: Discussion about Columnar layout of data

Posted by Gary <ga...@apache.org>.

Hi Yanping,

I think if we adopt solution b) there would be no overlap with Apache Arrow. Otherwise, I propose that we have our own durable columnar collections if it is not possible for the Apache Arrow team to integrate our project. Thanks.

B.R,
+Gary


On 8/21/2016 11:24 AM, Yanping Wang wrote:
> Hi, Gary
>
> Is this a joint project with Arrow, or do you plan to build your own
> columnar collection?
>
> Thanks
> yanping

Re: Discussion about Columnar layout of data

Posted by Yanping Wang <yw...@gmail.com>.

Hi, Gary

Is this a joint project with Arrow, or do you plan to build your own
columnar collection?

Thanks
yanping
