Posted to user@arrow.apache.org by 刘晓臻 <ja...@seu.edu.cn> on 2020/07/10 07:58:37 UTC

How to get started with using the Arrow Java API?

Hi,

I’m a contributor to the Texera (https://github.com/Texera/texera) project, an online big data analytics system that provides visual, interactive workflows. We are currently trying to use Arrow in our system (for now to transfer data between the JVM and a Python process, though we may eventually adopt Arrow throughout the system).
However, there seem to be very few tutorials on the Arrow Java API, and the documentation on Arrow’s official website offers only the Maven JavaDoc, which is not very informative as a starting point.
So where can I find documentation like that of the C++ and Python APIs, where many examples are shown? I’ve noticed there are a few such pages for Java (e.g. https://arrow.apache.org/docs/java/ipc.html), but these pages are not indexed and can only be discovered by searching. Is this something that is being worked on?

Thank you.

Best,
Xiaozhen Liu

Re: How to get started with using the Arrow Java API?

Posted by Micah Kornfield <em...@gmail.com>.
Hi Xiaozhen Liu,

The "hidden documentation" should be fixed with the next release of the
website [1].

Thanks,
Micah

[1]
https://issues.apache.org/jira/browse/ARROW-8649?jql=project%20%3D%20ARROW%20AND%20text%20~%20%22java%20hidden%22


RE: How to get started with using the Arrow Java API?

Posted by Xiaozhen Liu <ja...@seu.edu.cn>.
Hello Joris,

Thanks for your reply!
I had actually read your blog post before I asked my question. It is very well written and informative, and with the help of your tutorial I was able to get started. Thank you for sharing it!
However, some official documentation would be better for exploring further. I’m beginning to see the power of Arrow and Arrow Flight in our project, and official documentation would be of great help if we continue to use Arrow throughout it.
Anyway, thank you so much!

Best,
Xiaozhen




Re: How to get started with using the Arrow Java API?

Posted by Neal Richardson <ne...@gmail.com>.
Hi Joris,
The existing prose Java documentation is in
https://github.com/apache/arrow/tree/master/docs/source/java, so I'd
imagine that's where you'd want to put any additional content.

Neal


Re: How to get started with using the Arrow Java API?

Posted by Joris Gillis <jo...@trendminer.com>.
Hi Neal

I would definitely like to help out with the documentation. Let me check the copyright agreement with InfoQ. 

In the meantime, could you point me in the right direction to contribute documentation?

Thanks
Joris



Re: How to get started with using the Arrow Java API?

Posted by Neal Richardson <ne...@gmail.com>.
Hi Joris,
Since you've written up some nice prose documentation for the Java API,
would you be interested in contributing it, or selections of it, to the
official project documentation? It looks like it would be a valuable
addition.

Neal


Re: How to get started with using the Arrow Java API?

Posted by Joris Gillis <jo...@trendminer.com>.
Hi Xiaozhen

Sorry in advance for the self-promotion. 

I had the same issue, hence I wrote down what I figured out about the Java API in this blog post: https://www.infoq.com/articles/apache-arrow-java/

Another blog post can be found here: https://github.com/animeshtrivedi/blog/blob/master/post/2017-12-26-arrow.md

And last but not least, there is some more documentation inside the source code: https://github.com/apache/arrow/tree/master/java
In particular:
- https://arrow.apache.org/docs/java/vector.html
- https://arrow.apache.org/docs/java/vector_schema_root.html
are good starting points for exploring the API.

Best regards
Joris
