Posted to user@spark.apache.org by Todd <bi...@163.com> on 2016/05/20 02:16:17 UTC

Does spark support Apache Arrow

According to the official site, http://arrow.apache.org/, Apache Arrow is used for columnar in-memory storage. I have two quick questions:
1. Does Spark support Apache Arrow?
2. When a DataFrame is cached in memory, the data is stored in a columnar in-memory format. What is the relationship between this feature and Apache Arrow? That is, when the data is already in Apache Arrow format, does Spark still need to spend the effort to cache the DataFrame in columnar in-memory form?

Re: Does spark support Apache Arrow

Posted by Nirav Patel <np...@xactlycorp.com>.
Kwon,

Isn't that JIRA part of the integration with Arrow? As far as Arrow as an
in-memory store goes, it probably conflicts with Spark's own Tungsten memory
representation, right?

Thanks
Nir

On Thu, May 19, 2016 at 8:03 PM, Hyukjin Kwon <gu...@gmail.com> wrote:

> FYI, there is a JIRA for this:
> https://issues.apache.org/jira/browse/SPARK-13534
>
> I hope this link is helpful.
>
> Thanks!
>
>
> 2016-05-20 11:18 GMT+09:00 Sun Rui <su...@163.com>:
>
>> 1. I don’t think so.
>> 2. Arrow is for in-memory columnar execution, while the cache is for
>> in-memory columnar storage.
>>
>> On May 20, 2016, at 10:16, Todd <bi...@163.com> wrote:
>>
>> According to the official site, http://arrow.apache.org/, Apache Arrow is
>> used for columnar in-memory storage. I have two quick questions:
>> 1. Does Spark support Apache Arrow?
>> 2. When a DataFrame is cached in memory, the data is stored in a columnar
>> in-memory format. What is the relationship between this feature and
>> Apache Arrow? That is, when the data is already in Apache Arrow format,
>> does Spark still need to spend the effort to cache the DataFrame in
>> columnar in-memory form?
>>
>>
>>
>


Re: Does spark support Apache Arrow

Posted by Hyukjin Kwon <gu...@gmail.com>.
FYI, there is a JIRA for this,
https://issues.apache.org/jira/browse/SPARK-13534

I hope this link is helpful.

Thanks!


2016-05-20 11:18 GMT+09:00 Sun Rui <su...@163.com>:

> 1. I don’t think so.
> 2. Arrow is for in-memory columnar execution, while the cache is for
> in-memory columnar storage.
>
> On May 20, 2016, at 10:16, Todd <bi...@163.com> wrote:
>
> According to the official site, http://arrow.apache.org/, Apache Arrow is
> used for columnar in-memory storage. I have two quick questions:
> 1. Does Spark support Apache Arrow?
> 2. When a DataFrame is cached in memory, the data is stored in a columnar
> in-memory format. What is the relationship between this feature and
> Apache Arrow? That is, when the data is already in Apache Arrow format,
> does Spark still need to spend the effort to cache the DataFrame in
> columnar in-memory form?
>
>
>

Re: Does spark support Apache Arrow

Posted by Sun Rui <su...@163.com>.
1. I don’t think so.
2. Arrow is for in-memory columnar execution, while the cache is for in-memory columnar storage.
> On May 20, 2016, at 10:16, Todd <bi...@163.com> wrote:
> 
> According to the official site, http://arrow.apache.org/, Apache Arrow is used for columnar in-memory storage. I have two quick questions:
> 1. Does Spark support Apache Arrow?
> 2. When a DataFrame is cached in memory, the data is stored in a columnar in-memory format. What is the relationship between this feature and Apache Arrow? That is, when the data is already in Apache Arrow format, does Spark still need to spend the effort to cache the DataFrame in columnar in-memory form?
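
For readers unfamiliar with the term "columnar in-memory" used throughout this thread, here is a rough sketch in plain Python of the difference between a row-oriented and a column-oriented layout. It is only an illustration of the general idea: it is not Spark's cache format, not Tungsten, and not the Arrow specification. The sample records and the names row_store, column_store, total_price_rows, and total_price_columns are made up for this example.

```python
from array import array

# Hypothetical sample records of the form (id, price); not from the thread.
rows = [(1, 9.5), (2, 12.0), (3, 7.25)]

# Row-oriented layout: the fields of each record are stored together.
row_store = list(rows)

# Column-oriented layout: each field lives in its own contiguous typed buffer.
column_store = {
    "id": array("q", (r[0] for r in rows)),      # signed 64-bit integers
    "price": array("d", (r[1] for r in rows)),   # 64-bit floats
}


def total_price_rows(store):
    """Sum prices by visiting every record, even though one field is needed."""
    return sum(record[1] for record in store)


def total_price_columns(store):
    """Sum prices by scanning only the 'price' buffer."""
    return sum(store["price"])
```

Both functions return the same total. The columnar version reads a single buffer of one type, which is the property that makes columnar layouts attractive for analytic scans, whether the buffer is used for storage (a cache) or passed between operators during execution.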