Posted to user@spark.apache.org by Shuai Zheng <sz...@gmail.com> on 2015/03/09 17:25:33 UTC
Read Parquet file from scala directly
Hi All,
I have a lot of Parquet files, and I am trying to open them directly instead
of loading them into an RDD in the driver (so that I can optimize performance
with some special logic).
I did some research online but couldn't find any example of accessing Parquet
directly from Scala. Has anyone done this before?
Regards,
Shuai
Re: Read Parquet file from scala directly
Posted by Akhil Das <ak...@sigmoidanalytics.com>.
Here's a Java version:
https://github.com/cloudera/parquet-examples/tree/master/MapReduce
It shouldn't be hard to port that to Scala.
Thanks
Best Regards
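For readers who want a starting point in Scala, here is a minimal sketch of reading records from a single Parquet file without going through Spark. It is not taken from the linked example. It assumes the parquet-hadoop and hadoop-client libraries are on the classpath and uses the `org.apache.parquet` package names (releases from around the time of this thread used a `parquet.*` prefix instead). The object name, the `drain` helper, and the default file path are hypothetical.

```scala
import org.apache.hadoop.fs.Path
import org.apache.parquet.example.data.Group
import org.apache.parquet.hadoop.ParquetReader
import org.apache.parquet.hadoop.example.GroupReadSupport

object ParquetDirectRead {
  // Wrap a "returns null when exhausted" read call in a Scala Iterator.
  def drain[T <: AnyRef](next: () => T): Iterator[T] =
    Iterator.continually(next()).takeWhile(_ != null)

  def main(args: Array[String]): Unit = {
    // Hypothetical default path; pass a real file as the first argument.
    val path = new Path(args.headOption.getOrElse("/tmp/example.parquet"))

    // GroupReadSupport materializes each row as a generic Group,
    // so no schema-specific classes are needed.
    val reader: ParquetReader[Group] =
      ParquetReader.builder(new GroupReadSupport(), path).build()

    try {
      // Print the first ten records; reader.read() returns null at end of file.
      drain(() => reader.read()).take(10).foreach(println)
    } finally {
      reader.close()
    }
  }
}
```

Because the records come back as a lazy iterator, custom filtering or early termination can be applied before any row is fully consumed, which may be where "special logic" of the kind described in the question would go.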
Re: Read Parquet file from scala directly
Posted by Cheng Lian <li...@gmail.com>.
The parquet-tools code should be pretty helpful (although it's in Java):
https://github.com/apache/incubator-parquet-mr/tree/master/parquet-tools/src/main/java/parquet/tools/command
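Several of the parquet-tools commands inspect a file's footer (schema and row-group metadata). A rough Scala sketch of that idea follows. It is not a port of the parquet-tools code, and it assumes a parquet-hadoop release recent enough to provide `HadoopInputFile` and `ParquetFileReader.open` under the `org.apache.parquet` packages. The object name, the `summary` helper, and the default path are hypothetical.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.parquet.hadoop.ParquetFileReader
import org.apache.parquet.hadoop.util.HadoopInputFile

object ParquetFooterInfo {
  // Hypothetical helper: format the counts read from the footer.
  def summary(rowGroups: Int, rows: Long): String =
    s"row groups: $rowGroups, rows: $rows"

  def main(args: Array[String]): Unit = {
    // Hypothetical default path; pass a real file as the first argument.
    val path = new Path(args.headOption.getOrElse("/tmp/example.parquet"))
    val reader =
      ParquetFileReader.open(HadoopInputFile.fromPath(path, new Configuration()))

    try {
      // The footer carries the file schema; no row data is read here.
      println(reader.getFooter.getFileMetaData.getSchema)
      println(summary(reader.getRowGroups.size, reader.getRecordCount))
    } finally {
      reader.close()
    }
  }
}
```

Reading only the footer is cheap, so it can be used to decide which files or row groups are worth opening before reading any records.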