Posted to issues@flink.apache.org by "Yubin Li (Jira)" <ji...@apache.org> on 2022/04/07 09:11:00 UTC

[jira] [Comment Edited] (FLINK-27100) Support parquet format in FileStore

    [ https://issues.apache.org/jira/browse/FLINK-27100?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17518713#comment-17518713 ] 

Yubin Li edited comment on FLINK-27100 at 4/7/22 9:10 AM:
----------------------------------------------------------

[~lzljs3620320] Thanks for your attention. I have configured file.format and encountered the following exception:
{code:java}
Could not find any factories that implement 'org.apache.flink.table.store.shaded.org.apache.flink.connector.file.table.factories.BulkReaderFormatFactory' in the classpath.
    at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:526) {code}
When I look into the classpath, the shaded SPI indeed does not exist. Maybe we can wrap the Parquet format like ORC, which would solve the first problem as you figured.
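
For anyone who wants to repeat the classpath check, here is a minimal sketch. It assumes that factory discovery relies on the standard Java SPI mechanism, i.e. provider names listed in META-INF/services/<service name> files. The class SpiResourceCheck and its methods are hypothetical helpers written for this comment, not part of Flink, and the default service name below is an assumption about which service file is consulted.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical diagnostic helper: lists SPI providers registered on the classpath. */
public class SpiResourceCheck {

    /** Extracts provider class names from the lines of a META-INF/services file. */
    static List<String> parseProviders(List<String> lines) {
        List<String> providers = new ArrayList<>();
        for (String line : lines) {
            // Everything after '#' is a comment in the service file format.
            int hash = line.indexOf('#');
            String name = (hash >= 0 ? line.substring(0, hash) : line).trim();
            if (!name.isEmpty()) {
                providers.add(name);
            }
        }
        return providers;
    }

    /** Collects provider names from every matching service file visible to the class loader. */
    static List<String> listProviders(String serviceName, ClassLoader cl) throws IOException {
        List<String> providers = new ArrayList<>();
        String resource = "META-INF/services/" + serviceName;
        for (URL url : Collections.list(cl.getResources(resource))) {
            List<String> lines = new ArrayList<>();
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
            providers.addAll(parseProviders(lines));
        }
        return providers;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the service interface to inspect; pass another name as the first argument.
        String serviceName = args.length > 0
                ? args[0]
                : "org.apache.flink.table.factories.Factory";
        List<String> providers =
                listProviders(serviceName, SpiResourceCheck.class.getClassLoader());
        if (providers.isEmpty()) {
            System.out.println("No providers registered for " + serviceName);
        } else {
            for (String provider : providers) {
                System.out.println(provider);
            }
        }
    }
}
```

Run it with the same classpath as the failing job. If no Parquet factory shows up in the output, or the listed factories only implement the unshaded interface, that would be consistent with the exception above.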


was (Author: liyubin117):
[~lzljs3620320] Thanks for your attention. I have configured file.format and encountered the following exception:
{code:java}
Could not find any factories that implement 'org.apache.flink.table.store.shaded.org.apache.flink.connector.file.table.factories.BulkReaderFormatFactory' in the classpath.
    at org.apache.flink.table.factories.FactoryUtil.discoverFactory(FactoryUtil.java:526) {code}
When I look into the uber jar of flink-table-store, the SPI indeed does not exist. Maybe we can wrap the Parquet format like ORC, which would solve the first problem as you figured.

> Support parquet format in FileStore
> -----------------------------------
>
>                 Key: FLINK-27100
>                 URL: https://issues.apache.org/jira/browse/FLINK-27100
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table Store
>            Reporter: Yubin Li
>            Priority: Major
>
> Apache Parquet is a very popular columnar file format, used in many data analysis engines like Hive/Impala/Spark/Flink. We can use simple command-line tools like parquet-tools to view metadata and data easily instead of writing complex Java code.
> Currently flink-table-store only supports ORC, but a massive amount of business data is stored in Parquet format, developers and analysts are very familiar with it, and Parquet has better support in the Impala engine.
> Maybe it would be a good addition to make Parquet usable.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)