Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/22 06:59:29 UTC

[GitHub] [hudi] vinothchandar commented on a change in pull request #1992: [BLOG] Incremental processing on data lakes by vinoyang

vinothchandar commented on a change in pull request #1992:
URL: https://github.com/apache/hudi/pull/1992#discussion_r475056337



##########
File path: docs/_posts/2020-08-18-hudi-incremental-processing-on-data-lakes.md
##########
@@ -0,0 +1,275 @@
+---
+title: "Incremental Processing on the Data Lake"
+excerpt: "How Apache Hudi provides ability for incremental data processing."
+author: vinoyang
+category: blog
+---
+
+### NOTE: This article is a translation of the infoq.cn article, found [here](https://www.infoq.cn/article/CAgIDpfJBVcJHKJLSbhe), with minor edits
+
+Apache Hudi is a data lake framework which provides the ability to ingest, manage and query large analytical data sets on a distributed file system/cloud stores. 
+Hudi joined the Apache incubator for incubation in January 2019, and was promoted to the top Apache project in May 2020. This article mainly discusses the importance 
+of Hudi to the data lake from the perspective of "incremental processing". More information about Apache Hudi's framework functions, features, usage scenarios, and 
+latest developments can be found at QCon Global Software Development Conference (Beijing Station) 2020.

Review comment:
       @yanghua do you have any links to this?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org