Posted to issues@nifi.apache.org by "Bryan Bende (JIRA)" <ji...@apache.org> on 2019/04/02 14:07:03 UTC

[jira] [Commented] (NIFIREG-242) Two-way synchronization of git repository backed flows

    [ https://issues.apache.org/jira/browse/NIFIREG-242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16807788#comment-16807788 ] 

Bryan Bende commented on NIFIREG-242:
-------------------------------------

[~DennisSeiffert] this sounds like an interesting feature, although I have a few concerns.

The registry is designed to store versioned items. Currently the only versioned items are flows, but in the master branch we have added extension bundles like NARs, and eventually we also want to add assets like datasets and config files. A bucket can contain any number of versioned items of any type, so a given bucket could hold flows, extensions, and assets all together.

The back end is set up so that the metadata database holds the knowledge of all the buckets and which items belong to which buckets, and each type of item has its own persistence provider. For example, flows may be stored in Git and extension bundles may be stored in an object store like S3, and really anyone can implement their own persistence provider for any of these because it's a pluggable extension point.
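
As a rough illustration of that layout (a sketch only: every class, interface, and method name below is made up for the example and is not the actual registry API), the metadata database records bucket membership while a per-type provider holds the content:

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RegistryLayoutSketch {

    /** Kinds of versioned items a bucket may hold (hypothetical). */
    enum ItemType { FLOW, EXTENSION_BUNDLE, ASSET }

    /** Hypothetical stand-in for a pluggable persistence provider. */
    interface PersistenceProvider {
        void save(String itemId, byte[] content);
        byte[] load(String itemId);
    }

    /** In-memory provider; a real one might write to Git or an object store. */
    static class InMemoryProvider implements PersistenceProvider {
        private final Map<String, byte[]> store = new HashMap<>();
        @Override public void save(String itemId, byte[] content) { store.put(itemId, content); }
        @Override public byte[] load(String itemId) { return store.get(itemId); }
    }

    /** One metadata row: which item of which type lives in which bucket. */
    static class ItemRecord {
        final String bucketId;
        final String itemId;
        final ItemType type;

        ItemRecord(String bucketId, String itemId, ItemType type) {
            this.bucketId = bucketId;
            this.itemId = itemId;
            this.type = type;
        }
    }

    // Stand-in for the metadata database.
    private final List<ItemRecord> metadata = new ArrayList<>();
    // One provider per item type; each can be swapped independently.
    private final Map<ItemType, PersistenceProvider> providers = new EnumMap<>(ItemType.class);

    void registerProvider(ItemType type, PersistenceProvider provider) {
        providers.put(type, provider);
    }

    void addItem(String bucketId, String itemId, ItemType type, byte[] content) {
        PersistenceProvider provider = providers.get(type);
        if (provider == null) {
            throw new IllegalStateException("No provider registered for " + type);
        }
        provider.save(itemId, content);                       // content goes to the provider
        metadata.add(new ItemRecord(bucketId, itemId, type)); // membership goes to the DB
    }

    List<ItemRecord> itemsInBucket(String bucketId) {
        List<ItemRecord> result = new ArrayList<>();
        for (ItemRecord r : metadata) {
            if (r.bucketId.equals(bucketId)) {
                result.add(r);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        RegistryLayoutSketch registry = new RegistryLayoutSketch();
        registry.registerProvider(ItemType.FLOW, new InMemoryProvider());
        registry.registerProvider(ItemType.EXTENSION_BUNDLE, new InMemoryProvider());
        registry.addItem("bucket-a", "flow-1", ItemType.FLOW, new byte[] {1});
        registry.addItem("bucket-a", "nar-1", ItemType.EXTENSION_BUNDLE, new byte[] {2});
        System.out.println(registry.itemsInBucket("bucket-a").size() + " items in bucket-a");
    }
}
```

The point of the sketch is that no single provider can enumerate a bucket on its own; only the metadata list sees both the flow and the bundle.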

Given the above, we can't do things like this:
{code:java}
if (this.flowPersistenceProvider instanceof GitFlowPersistenceProvider) {
    deleteAllBucketsInMetaDatabase();
    return createBucketsFromGitProvider();
}
{code}
There could be buckets that only have extension bundles stored in them, which git wouldn't know about and couldn't recreate, and there could be buckets with both flows and extension bundles, in which case we'd lose the knowledge of the extension bundles.
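
To make the loss concrete, here is a minimal sketch, assuming a toy metadata DB modeled as a map from bucket name to item labels. The `flow:` and `bundle:` prefixes and the `rebuildFromGit` helper are hypothetical and exist only for this example:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class GitRebuildLossSketch {

    /**
     * Simulates rebuilding the metadata from a Git-backed flow provider:
     * only items labeled as flows can be recovered, and a bucket with no
     * flows is not recreated at all.
     */
    static Map<String, List<String>> rebuildFromGit(Map<String, List<String>> metadataDb) {
        Map<String, List<String>> rebuilt = new TreeMap<>();
        for (Map.Entry<String, List<String>> bucket : metadataDb.entrySet()) {
            List<String> flows = new ArrayList<>();
            for (String item : bucket.getValue()) {
                if (item.startsWith("flow:")) {
                    flows.add(item);
                }
            }
            if (!flows.isEmpty()) {
                rebuilt.put(bucket.getKey(), flows);
            }
        }
        return rebuilt;
    }

    public static void main(String[] args) {
        Map<String, List<String>> db = new TreeMap<>();
        db.put("mixed", Arrays.asList("flow:ingest", "bundle:custom-nar"));
        db.put("bundles-only", Arrays.asList("bundle:other-nar"));
        System.out.println("before: " + db);
        System.out.println("after:  " + rebuildFromGit(db));
    }
}
```

After the simulated rebuild, the `bundles-only` bucket is gone and `mixed` keeps only its flow, which is exactly the knowledge the DB would no longer have.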

The reason we could implement NIFIREG-209 (rebuild metadata DB from git repo) is that the logic is only triggered when starting a fresh instance with an empty DB, so at that point it is safe to populate the DB from what is in the git repo. Once the app is running and there are potentially different types of versioned items in different buckets, the DB has to be the source of truth for what buckets and items exist.
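
The shape of that guard can be sketched as follows. This is not the NIFIREG-209 code: the method name and the use of plain lists for the DB and the git contents are assumptions made for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StartupRebuildSketch {

    /**
     * Imports the buckets found in git only when the metadata DB is empty.
     * Returns true if the import ran, false if the DB was left untouched.
     */
    static boolean rebuildIfFresh(List<String> dbBuckets, List<String> gitBuckets) {
        if (!dbBuckets.isEmpty()) {
            return false; // DB already has state, so it stays the source of truth
        }
        dbBuckets.addAll(gitBuckets);
        return true;
    }

    public static void main(String[] args) {
        List<String> freshDb = new ArrayList<>();
        List<String> runningDb = new ArrayList<>(Arrays.asList("bundles-only"));
        List<String> git = Arrays.asList("flows-a", "flows-b");

        System.out.println("fresh instance imported: " + rebuildIfFresh(freshDb, git));
        System.out.println("running instance imported: " + rebuildIfFresh(runningDb, git));
    }
}
```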

Generally I think we want to avoid people bypassing the application and doing things in git, because then the application becomes tightly coupled to the assumption that git is present. For example, the description mentioned repairing a broken flow due to a changed registry URL. This should be something we support fixing through the application, and I believe there is already a PR open for that (NIFIREG-238). If we need a branching concept (not sure we do) then we should consider building that into the application so that it could work across any FlowPersistenceProvider and not just git.

 

> Two-way synchronization of git repository backed flows
> ------------------------------------------------------
>
>                 Key: NIFIREG-242
>                 URL: https://issues.apache.org/jira/browse/NIFIREG-242
>             Project: NiFi Registry
>          Issue Type: New Feature
>    Affects Versions: 0.4.0
>            Reporter: Dennis Seiffert
>            Priority: Major
>              Labels: git
>
> This feature would make life easier for NiFi users and developers who use git as the version control backend for the registry (especially in dockerized environments). As a result, the git repository would be the single source of truth for maintaining NiFi flows. The feature includes the following abilities without affecting existing functionality:
>  * synchronize the remote git repository with the local (nifi-registry) git repository in order to support multiple registries (imagine changing a flow in a test environment and updating the flow in a production environment via feature branches in git, etc.) and third-party systems (git changes not made by the registry, repairing a broken flow file because of a changed registry URL in the flow XML)
>  * initial import of git repository into registry's metadata database on startup (see open issue #NIFIREG-227)
>  * ability to reset local git repository (including metadata database) to the state of the remote repository 
>  * get recent status of synchronization process
>  * control synchronization via REST endpoints (reset repository to initial state, pull latest changes from git remote repository)


