Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/05/13 15:19:16 UTC

[GitHub] [hudi] whitecloud6688 opened a new issue, #5579: [SUPPORT] Add a function for fetching incremental data

whitecloud6688 opened a new issue, #5579:
URL: https://github.com/apache/hudi/issues/5579

   **Describe the problem you faced**
   
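   In data warehouse work, I often struggle to get the incremental data of a table. For example, at hour 1 data is synced from table A to table B; at hour 2 the job needs to read only the incremental data of table A and write that increment to table B. Currently there is no simple way to get the incremental data.
   I would like a function that returns the table's previous incremental value (the last value already synced), for example:
   SELECT incremental_function(_hoodie_commit_time) FROM A;  would return the previous incremental value of the _hoodie_commit_time field of table A; return value: 12
   SELECT incremental_function(_hoodie_commit_seqno) FROM A;  would return the previous incremental value of the _hoodie_commit_seqno field of table A;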
   
   _hoodie_commit_time
   11   # synced by the hour-1 job
   12   # synced by the hour-1 job
   21   # increment the hour-2 job needs to fetch
   22   # increment the hour-2 job needs to fetch
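
A minimal sketch of the requested behavior, using the commit times listed above. It assumes the sync job stores the last commit time it processed and that commit times are equal-width strings, so plain string comparison orders them. The names `incremental_rows` and `last_synced` are hypothetical and are not Hudi APIs.

```python
from typing import List, Tuple


def incremental_rows(commit_times: List[str], last_synced: str) -> Tuple[List[str], str]:
    """Return the commit times newer than last_synced, plus the new checkpoint."""
    # Assumption: commit times are equal-width strings, so "21" > "12" holds
    # under ordinary string comparison.
    new = sorted(t for t in commit_times if t > last_synced)
    # If nothing new arrived, the checkpoint stays where it was.
    checkpoint = new[-1] if new else last_synced
    return new, checkpoint


# Commit times from the example: 11 and 12 were synced at hour 1.
commits = ["11", "12", "21", "22"]
rows, checkpoint = incremental_rows(commits, last_synced="12")
print(rows, checkpoint)
```

Storing `checkpoint` after each run gives the next run its starting point, which is the role the requested function would play.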
   
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
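   1. SELECT incremental_function(_hoodie_commit_time) FROM A;  would return the previous incremental value of the _hoodie_commit_time field of table A;
   2. SELECT incremental_function(_hoodie_commit_seqno) FROM A;  would return the previous incremental value of the _hoodie_commit_seqno field of table A;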
   
   
   **Expected behavior**
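   SELECT incremental_function(_hoodie_commit_time) FROM A;  would return the previous incremental value of the _hoodie_commit_time field of table A; return value: 12
   SELECT incremental_function(_hoodie_commit_seqno) FROM A;  would return the previous incremental value of the _hoodie_commit_seqno field of table A;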
   
   
   **Environment Description**
   
   * Hudi version :0.11.0
   
   * Spark version :3.2.1
   
   * Hive version :2.3.1
   
   * Hadoop version :2.10.1
   
   * Storage (HDFS/S3/GCS..) :
   
   * Running on Docker? (yes/no) :no
   
   
   **Additional context**
   
   
   
   **Stacktrace**
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] whitecloud6688 closed issue #5579: [SUPPORT] Add a function for fetching incremental data

Posted by GitBox <gi...@apache.org>.
whitecloud6688 closed issue #5579: [SUPPORT] Add a function for fetching incremental data
URL: https://github.com/apache/hudi/issues/5579




[GitHub] [hudi] whitecloud6688 commented on issue #5579: [SUPPORT] Add a function for fetching incremental data

Posted by GitBox <gi...@apache.org>.
whitecloud6688 commented on issue #5579:
URL: https://github.com/apache/hudi/issues/5579#issuecomment-1126636765

   Yes, that works, thank you very much. It looks like I need to study the official documentation more. @cdmikechen




[GitHub] [hudi] cdmikechen commented on issue #5579: [SUPPORT] Add a function for fetching incremental data

Posted by GitBox <gi...@apache.org>.
cdmikechen commented on issue #5579:
URL: https://github.com/apache/hudi/issues/5579#issuecomment-1126180556

   Hi, does this section of the docs solve your problem? If not, how are you doing it now?
   https://hudi.apache.org/docs/querying_data#spark-incr-query
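
For reference, a rough sketch of what the Spark incremental query linked above looks like from PySpark. The option keys are the Hudi datasource read options; the helper names `incremental_read_options` and `read_incremental`, the table path, and the sample instant time are illustrative assumptions. As far as I understand it, the begin instant is exclusive, so passing the last synced commit time returns only newer commits.

```python
from typing import Dict, Optional


def incremental_read_options(begin_time: str, end_time: Optional[str] = None) -> Dict[str, str]:
    """Build the read options for a Hudi incremental query."""
    opts = {
        "hoodie.datasource.query.type": "incremental",
        # Only commits after this instant are returned.
        "hoodie.datasource.read.begin.instanttime": begin_time,
    }
    if end_time is not None:
        # Optional upper bound on the commit range.
        opts["hoodie.datasource.read.end.instanttime"] = end_time
    return opts


def read_incremental(spark, base_path: str, begin_time: str):
    """Load rows committed after begin_time (needs a SparkSession with the Hudi bundle)."""
    return (
        spark.read.format("hudi")
        .options(**incremental_read_options(begin_time))
        .load(base_path)
    )


# Hypothetical instant time standing in for the last synced _hoodie_commit_time.
options = incremental_read_options("20220513150000")
print(options)
```

A sync job could pass the largest `_hoodie_commit_time` it wrote to table B as `begin_time` on its next run against table A.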

