Posted to commits@seatunnel.apache.org by "cqutwangyu (via GitHub)" <gi...@apache.org> on 2024/02/29 05:49:33 UTC

[I] [Feature][seatunnel] Data flow audit log [seatunnel]

cqutwangyu opened a new issue, #6421:
URL: https://github.com/apache/seatunnel/issues/6421

   ### Search before asking
   
   - [X] I had searched in the [feature](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22Feature%22) and found no similar feature requirement.
   
   
   ### Description
   
   In my use case, I deploy multiple SeatunnelServer instances in cluster mode and then start hundreds of SeatunnelClient instances.
   
   Their job is to read messages from the source, process them through a transform, and send them to the sink.
   
   A large volume of data passes through this pipeline, and there is no guarantee that every record is processed successfully, so I need to record the status of each record.
   
   My proposal is as follows:
   
   1. SourceReader logs the raw data and records the result of the transform step (success or error).
   
   2. SinkWriter logs the data output by the transform and records whether it was sent successfully (success or failure).
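   
   The two steps above could be captured with a small audit structure. The following is a minimal sketch in plain Java, not SeaTunnel code: `AuditLog`, `AuditEntry`, `Stage`, and the method names are all hypothetical, and an in-memory list stands in for the table that would hold the records.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Hypothetical types for illustration only; none of these are SeaTunnel APIs.
   enum Stage { SOURCE_TRANSFORM, SINK_WRITE }

   final class AuditEntry {
       final String recordId;
       final Stage stage;
       final String payload; // raw data (source side) or transform output (sink side)
       final String error;   // null means the stage succeeded

       AuditEntry(String recordId, Stage stage, String payload, String error) {
           this.recordId = recordId;
           this.stage = stage;
           this.payload = payload;
           this.error = error;
       }

       boolean isSuccess() {
           return error == null;
       }
   }

   final class AuditLog {
       // In-memory stand-in for an audit table.
       private final List<AuditEntry> entries = new ArrayList<>();

       // Step 1: the source side logs the raw data and the transform result.
       void logTransform(String recordId, String rawData, Exception failure) {
           add(recordId, Stage.SOURCE_TRANSFORM, rawData, failure);
       }

       // Step 2: the sink side logs the transform output and the send result.
       void logSink(String recordId, String output, Exception failure) {
           add(recordId, Stage.SINK_WRITE, output, failure);
       }

       private void add(String recordId, Stage stage, String payload, Exception failure) {
           String error = failure == null ? null : String.valueOf(failure.getMessage());
           entries.add(new AuditEntry(recordId, stage, payload, error));
       }

       // Returns every entry whose stage failed, in insertion order.
       List<AuditEntry> failures() {
           List<AuditEntry> failed = new ArrayList<>();
           for (AuditEntry e : entries) {
               if (!e.isSuccess()) {
                   failed.add(e);
               }
           }
           return failed;
       }
   }
   ```
   
   Passing `null` as the failure marks a stage as successful, so a record that transforms cleanly but fails to send shows up once in `failures()` with stage `SINK_WRITE`.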
   
   ### Usage Scenario
   
   1. When a data processing exception occurs in a SeaTunnel job, the table records can be used to review the error and resolve the fault.
   
   2. When a downstream system has trouble processing the data, the table records can be used to check whether the SeaTunnel job processed it correctly.
   
   ### Related issues
   
   pass
   
   ### Are you willing to submit a PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@seatunnel.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] [Feature][seatunnel] Data flow audit log [seatunnel]

Posted by "cqutwangyu (via GitHub)" <gi...@apache.org>.
cqutwangyu commented on issue #6421:
URL: https://github.com/apache/seatunnel/issues/6421#issuecomment-1978316246

   Maybe we can do it based on https://github.com/apache/seatunnel/pull/6419




Re: [I] [Feature][seatunnel] Data flow audit log [seatunnel]

Posted by "Hisoka-X (via GitHub)" <gi...@apache.org>.
Hisoka-X commented on issue #6421:
URL: https://github.com/apache/seatunnel/issues/6421#issuecomment-1970488658

   This feature is more like a callback function. Users would need to implement the body of the callback themselves, for example to write the results to a database. However, it is difficult to judge whether data was written successfully, because each connector implements its writes differently. Do you have more detailed design ideas to share?
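   
   To make the callback idea concrete, here is a minimal sketch in plain Java. It is an assumption about what such a hook might look like, not an existing SeaTunnel interface: `WriteResultCallback` and `DemoWriter` are hypothetical names, and the writer only simulates a send.
   
   ```java
   import java.util.ArrayList;
   import java.util.List;

   // Hypothetical callback contract; not part of SeaTunnel.
   interface WriteResultCallback {
       void onResult(String recordId, boolean success, String detail);
   }

   // Stand-in for a connector's writer. A real connector would have to decide
   // for itself when a write counts as successful (sync ack, async ack, commit, ...).
   final class DemoWriter {
       private final WriteResultCallback callback;

       DemoWriter(WriteResultCallback callback) {
           this.callback = callback;
       }

       void write(String recordId, String payload) {
           try {
               if (payload.isEmpty()) {
                   throw new IllegalArgumentException("empty payload");
               }
               // A real writer would send the payload to the sink here.
               callback.onResult(recordId, true, "ok");
           } catch (RuntimeException e) {
               // Report the failure to the user-supplied callback.
               callback.onResult(recordId, false, e.getMessage());
           }
       }
   }
   ```
   
   The user decides what the callback does with each result, for instance `new DemoWriter((id, ok, detail) -> auditRows.add(id + ":" + ok))`, where `auditRows` is any collection or database writer the user supplies.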




Re: [I] [Feature][seatunnel] Data flow audit log [seatunnel]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #6421:
URL: https://github.com/apache/seatunnel/issues/6421#issuecomment-2038490125

   This issue has been automatically marked as stale because it has not had recent activity for 30 days. It will be closed in the next 7 days if no further activity occurs.




Re: [I] [Feature][seatunnel] Data flow audit log [seatunnel]

Posted by "cqutwangyu (via GitHub)" <gi...@apache.org>.
cqutwangyu commented on issue #6421:
URL: https://github.com/apache/seatunnel/issues/6421#issuecomment-1970459103

   @Hisoka-X @EricJoy2048 @hailin0 @TyrantLucifer 
   Could this feature be supported?




Re: [I] [Feature][seatunnel] Data flow audit log [seatunnel]

Posted by "cqutwangyu (via GitHub)" <gi...@apache.org>.
cqutwangyu commented on issue #6421:
URL: https://github.com/apache/seatunnel/issues/6421#issuecomment-1978264868

   > This feature is more like a callback function. Users would need to implement the body of the callback themselves, for example to write the results to a database. However, it is difficult to judge whether data was written successfully, because each connector implements its writes differently. Do you have more detailed design ideas to share?
   
   At present, I can only offer a solution for my own use case, based on a customized version of connector-rocketmq. I cannot yet offer a general solution that covers multiple data sources.

