Posted to dev@community.apache.org by "Maxim Solodovnik (Jira)" <ji...@apache.org> on 2023/03/13 05:04:00 UTC

[jira] [Updated] (COMDEV-512) [GSoC][Doris] Supports BigQuery/Apache Kudu/Apache Cassandra/Apache Druid in Federated Queries

     [ https://issues.apache.org/jira/browse/COMDEV-512?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maxim Solodovnik updated COMDEV-512:
------------------------------------
    Labels: Doris full-time gsoc2023 mentor  (was: ApacheDoris full-time gsoc2023 mentor)

> [GSoC][Doris] Supports BigQuery/Apache Kudu/Apache Cassandra/Apache Druid in Federated Queries 
> -----------------------------------------------------------------------------------------------
>
>                 Key: COMDEV-512
>                 URL: https://issues.apache.org/jira/browse/COMDEV-512
>             Project: Community Development
>          Issue Type: Task
>          Components: GSoC/Mentoring ideas
>            Reporter: Zhijing Lu
>            Priority: Major
>              Labels: Doris, full-time, gsoc2023, mentor
>
> *Apache Doris*
> Apache Doris is a real-time analytical database based on MPP architecture. As a unified platform that supports multiple data processing scenarios, it ensures high performance for low-latency and high-throughput queries, allows for easy federated queries on data lakes, and supports various data ingestion methods.
> Page: [https://doris.apache.org|https://doris.apache.org/]
> GitHub: [https://github.com/apache/doris]
> h3. *Background*
> Apache Doris supports acceleration of queries on external data sources to meet users' needs for federated queries and analysis.
> Currently, Apache Doris supports multiple external catalogs including those from Hive, Iceberg, Hudi, and JDBC. Developers can connect more data sources to Apache Doris based on a unified framework.
> h3. *Objective*
>  * Enable Apache Doris to access one or more of these data sources via the Multi-Catalog feature: BigQuery/Kudu/Cassandra/Druid;
>  * Compile relevant documentation. See an example here: [https://doris.apache.org/docs/dev/lakehouse/multi-catalog/hive]
> h3. *Task*
> {*}Phase One{*}:
>  * Get familiar with the Multi-Catalog structure of Apache Doris, including the metadata synchronization mechanism in FE and the data reading mechanism of BE.
>  * Investigate how metadata is acquired and how data access works for the chosen data source(s); produce the corresponding design documentation.
> {*}Phase Two{*}:
>  * Develop connections to the chosen data source(s) and implement access to metadata and data.
> h3. *Learning Material*
> {*}Page{*}: [https://doris.apache.org|https://doris.apache.org/]
> {*}GitHub{*}: [https://github.com/apache/doris]
> h3. Mentor
>  * Mentor: Mingyu Chen, Apache Doris PMC Member & Committer, [chenmingyu@apache.org|mailto:yangyongqiang@apache.org]
>  * Mentor: Calvin Kirs, Apache DolphinScheduler PMC Member & Committer, [CalvinKirs@apache.org|mailto:CalvinKirs@apache.org]
>  * Mailing List: dev@doris.apache.org


