Posted to commits@devlake.apache.org by "warren830 (via GitHub)" <gi...@apache.org> on 2023/05/23 12:49:14 UTC

[GitHub] [incubator-devlake] warren830 opened a new issue, #4247: [Feature][Framework] Allow extractor and convertor to support incremental update

warren830 opened a new issue, #4247:
URL: https://github.com/apache/incubator-devlake/issues/4247

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/incubator-devlake/issues?q=is%3Aissue) and found no similar feature requirement.
   
   
   ### Use case
   
   Since we already have `_devlake_collector_latest_state` to indicate the most recently updated data, we can also use it to let the extractor and convertor support incremental updates.
   
   ### Description
   
   Before executing the extractor and convertor, we can use `_devlake_collector_latest_state.latest_success_start` to filter the data.
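
The filtering idea above could look roughly like the following sketch. It is only an illustration under stated assumptions: `RawRecord`, `filterSince`, and `mustParse` are hypothetical names, not DevLake types or helpers, and the in-memory slice stands in for a raw data table. The only name taken from the issue is `latest_success_start`. The sketch also assumes that rows written by the latest successful collection carry a `created_at` at or after that run's start time.

```go
package main

import (
	"fmt"
	"time"
)

// RawRecord is a hypothetical stand-in for one row of a raw data table.
type RawRecord struct {
	ID        int
	CreatedAt time.Time // when the collector stored the row
}

// filterSince keeps the rows created at or after latestSuccessStart.
// A nil cutoff means no successful collection is recorded, so every row is kept.
func filterSince(rows []RawRecord, latestSuccessStart *time.Time) []RawRecord {
	if latestSuccessStart == nil {
		return rows
	}
	var out []RawRecord
	for _, r := range rows {
		if !r.CreatedAt.Before(*latestSuccessStart) {
			out = append(out, r)
		}
	}
	return out
}

// mustParse is a small helper for the example timestamps (illustrative values only).
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	cutoff := mustParse("2023-02-20T00:00:00Z")
	rows := []RawRecord{
		{ID: 1, CreatedAt: mustParse("2023-02-19T23:00:00Z")},
		{ID: 2, CreatedAt: mustParse("2023-02-20T00:05:00Z")},
	}
	for _, r := range filterSince(rows, &cutoff) {
		fmt.Println("would extract raw record", r.ID)
	}
}
```

In practice the cutoff would more likely be applied as a condition in the database query than in memory; the loop is used here only to keep the example self-contained.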
   
   ### Related issues
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@devlake.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-devlake] klesh commented on issue #4247: [Feature][Framework] Allow extractor and convertor to support incremental update

Posted by "klesh (via GitHub)" <gi...@apache.org>.
klesh commented on issue #4247:
URL: https://github.com/apache/incubator-devlake/issues/4247#issuecomment-1451277881

   I think this is a good idea!


[GitHub] [incubator-devlake] github-actions[bot] commented on issue #4247: [Feature][Framework] Allow extractor and convertor to support incremental update

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #4247:
URL: https://github.com/apache/incubator-devlake/issues/4247#issuecomment-1558255025

   This issue has been closed because it has not received a response for too long. You can reopen it if you encounter similar problems in the future.


[GitHub] [incubator-devlake] github-actions[bot] closed issue #4247: [Feature][Framework] Allow extractor and convertor to support incremental update

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #4247: [Feature][Framework] Allow extractor and convertor to support incremental update
URL: https://github.com/apache/incubator-devlake/issues/4247


[GitHub] [incubator-devlake] github-actions[bot] commented on issue #4247: [Feature][Framework] Allow extractor and convertor to support incremental update

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #4247:
URL: https://github.com/apache/incubator-devlake/issues/4247#issuecomment-1546774772

   This issue has been automatically marked as stale because it has not had any activity for 30 days. It will be closed in the next 7 days if no further activity occurs.


[GitHub] [incubator-devlake] klesh commented on issue #4247: [Feature][Framework] Allow extractor and convertor to support incremental update

Posted by "klesh (via GitHub)" <gi...@apache.org>.
klesh commented on issue #4247:
URL: https://github.com/apache/incubator-devlake/issues/4247#issuecomment-1455819161

   However, simply depending on the `_devlake_collector_latest_state.latest_success_start` is not reliable.
   
   ## The following factors must be considered:
   
   1. The `extractors` and `converters` work under the `delete and insert` principle, without any knowledge of their preceding `subtasks`.
   2. With the current design, users might collect data multiple times without running `extraction` or `conversion`.
   3. The relationship between `collectors`, `extractors`, and `converters` is **NOT** 1:1:1. Remember that some `subtasks` might produce multiple kinds of records; for example, the `jira issue extractor` produces `issues`, `changelog`, and others. Vice versa, a set of records of a specific scope might come from multiple upstream `subtasks`; for example, `changelog` could come from the `issue collector` or the `changelog collector`. The dependencies could get quite messy if we depend only on `_devlake_collector_latest_state`.
   
   In summary, it is hard and unreliable to distinguish `Incremental` and `FullRefresh` by examining the state of the preceding `collector`.
   
   ## Proposal
   
   I think it would be easier for us to track the state of `subtasks` (`extractors` and `converters`) without introducing dependencies.
   
   1. Each `extractor` or `converter` should have its own state
   2. The `ExtractorHelper` and `ConverterHelper` should support the `IsIncremental` option and determine whether to delete the existing records or not, just like the `CollectorHelper`. However, the `IsIncremental` conditions are different.
   3. The `IsIncremental` check for `ExtractorHelper` and `ConverterHelper` can be done by simply comparing its own `latest_success_start` with `max(created_at)` (where `created_at` is the time the record was created in the database).
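
One possible reading of the proposal above is sketched below. The comment does not spell out the direction of the comparison or how the state is stored, so everything here is an assumption made for illustration: `SubtaskState`, `Plan`, `planRun`, and `mustParse` are hypothetical names, not part of `ExtractorHelper` or `ConverterHelper`. The sketch assumes that a subtask with no recorded success does a full refresh (delete and insert), and that a subtask with a recorded `latest_success_start` runs incrementally and has work to do only when `max(created_at)` is newer than that timestamp.

```go
package main

import (
	"fmt"
	"time"
)

// SubtaskState is a hypothetical per-subtask record: each extractor or
// converter keeps its own state instead of reading the collector's.
type SubtaskState struct {
	LatestSuccessStart *time.Time // nil if the subtask has never succeeded
}

// Plan describes what a run should do under the assumptions stated above.
type Plan struct {
	Incremental bool      // false: delete existing output and reprocess all rows
	HasWork     bool      // false: no input rows newer than the last successful run
	Since       time.Time // lower bound on created_at when Incremental is true
}

// planRun compares the subtask's own latest_success_start with max(created_at).
// Rows are treated as new only if strictly newer than the last successful start;
// a real implementation would need to decide how to treat rows created at
// exactly that instant.
func planRun(state SubtaskState, maxCreatedAt time.Time) Plan {
	if state.LatestSuccessStart == nil {
		return Plan{Incremental: false, HasWork: true}
	}
	since := *state.LatestSuccessStart
	return Plan{Incremental: true, HasWork: maxCreatedAt.After(since), Since: since}
}

// mustParse is a small helper for the example timestamps (illustrative values only).
func mustParse(s string) time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return t
}

func main() {
	lastStart := mustParse("2023-03-01T00:00:00Z")
	newest := mustParse("2023-03-02T12:00:00Z")

	first := planRun(SubtaskState{}, newest)
	fmt.Printf("first run:  incremental=%v hasWork=%v\n", first.Incremental, first.HasWork)

	later := planRun(SubtaskState{LatestSuccessStart: &lastStart}, newest)
	fmt.Printf("later run:  incremental=%v hasWork=%v\n", later.Incremental, later.HasWork)
}
```

Because the decision depends only on the subtask's own state and the timestamps of its input rows, it does not need to know which upstream `collector` produced those rows, which is the point of avoiding the dependency.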
   
   


[GitHub] [incubator-devlake] github-actions[bot] commented on issue #4247: [Feature][Framework] Allow extractor and convertor to support incremental update

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #4247:
URL: https://github.com/apache/incubator-devlake/issues/4247#issuecomment-1619286689

   This issue has been automatically marked as stale because it has not had any activity for 30 days. It will be closed in the next 7 days if no further activity occurs.


[GitHub] [incubator-devlake] github-actions[bot] commented on issue #4247: [Feature][Framework] Let extractor and convertor to support increment update

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #4247:
URL: https://github.com/apache/incubator-devlake/issues/4247#issuecomment-1436145025

   This issue has been automatically marked as stale because it has not had any activity for 30 days. It will be closed in the next 7 days if no further activity occurs.


[GitHub] [incubator-devlake] github-actions[bot] commented on issue #4247: [Feature][Framework] Allow extractor and convertor to support incremental update

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on issue #4247:
URL: https://github.com/apache/incubator-devlake/issues/4247#issuecomment-1633363512

   This issue has been closed because it has not received a response for too long. You can reopen it if you encounter similar problems in the future.


[GitHub] [incubator-devlake] github-actions[bot] closed issue #4247: [Feature][Framework] Allow extractor and convertor to support incremental update

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed issue #4247: [Feature][Framework] Allow extractor and convertor to support incremental update
URL: https://github.com/apache/incubator-devlake/issues/4247

