Posted to dev@nifi.apache.org by Andre <an...@fucs.org> on 2017/07/05 13:00:12 UTC

Filesystem based Schema Registry, does it make sense?

dev,

As I continue to explore the Record-based processors, I found myself wondering:

Does it make sense to have a file-system based schema registry?

The idea would be to create something like AvroSchemaRegistry, but instead
of adding each schema as a controller service property, we would have a
property pointing to a directory.

Each avsc file within that directory would then be validated, with the root
"name" within the Avro schema used as the schema name (i.e. the equivalent
of the AvroSchemaRegistry property name).
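
To make the lookup rule concrete, here is a minimal sketch in Python rather than NiFi's Java. The function name load_schemas is hypothetical, and the "validation" is only a JSON parse plus a check for a root "name"; a real controller service would presumably run the files through the Avro schema parser instead.

```python
import json
from pathlib import Path


def load_schemas(directory):
    """Map each schema's root "name" to its parsed definition.

    Assumption: every schema lives in its own *.avsc file and is a JSON
    object with a top-level "name". Anything else is skipped.
    """
    schemas = {}
    for path in sorted(Path(directory).glob("*.avsc")):
        try:
            schema = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or malformed file: ignore it
        name = schema.get("name") if isinstance(schema, dict) else None
        if not isinstance(name, str) or not name:
            continue  # no usable root "name": cannot be registered
        if name in schemas:
            # Two files claiming the same name is ambiguous, so fail loudly.
            raise ValueError(f"duplicate schema name {name!r} in {path}")
        schemas[name] = schema
    return schemas
```

Rejecting duplicate names is one possible policy; silently letting the last file win would be another, at the cost of harder debugging.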

The rationale is that while the Hortonworks and Avro Schema Registries
work, I reckon the former is overkill for edge/DMZ NiFi deployments and
the latter is painful to update across multiple NiFi clusters.

Having a file-based registry with inotify or something of the sort would be
great for folks already using external configuration management.
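
As a rough illustration of the "something of the sort" option, the sketch below detects changes by polling instead of using inotify, which keeps it portable and dependency-free. The names snapshot and changed_files are hypothetical, and comparing modification time plus size is an assumed heuristic, not a guarantee that every edit is caught.

```python
import os


def snapshot(directory):
    """Return {filename: (mtime_ns, size)} for every .avsc file in directory."""
    state = {}
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file() and entry.name.endswith(".avsc"):
                info = entry.stat()
                state[entry.name] = (info.st_mtime_ns, info.st_size)
    return state


def changed_files(old, new):
    """Names that were added, modified, or removed between two snapshots."""
    return {
        name
        for name in old.keys() | new.keys()
        if old.get(name) != new.get(name)
    }
```

A service could take a snapshot on a timer and reload only when changed_files returns a non-empty set, so a configuration management push is picked up without a restart.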


What do you think?

Re: Filesystem based Schema Registry, does it make sense?

Posted by Andre <an...@fucs.org>.
Joe,

At least in my case, I would be happy to control versioning outside NiFi
(e.g. via git).

I would suspect that by adopting an approach like this, versioning would be
handled pretty much as it is with AvroSchemaRegistry?

I assumed that with AvroSchemaRegistry you either name your schema with
versioning in mind (i.e. mySchemav1) or you need to overwrite the schema
upon change? Am I missing something?
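
For illustration only, this is how a "version in the name" convention such as mySchemav1 could be interpreted. The helpers split_version and latest_by_base are hypothetical, and the trailing "v<digits>" pattern is an assumption. It is also ambiguous: a name like "dev2" would be read as base "de", version 2.

```python
import re

# Assumed convention: <base>v<number>, e.g. "mySchemav1".
_VERSIONED = re.compile(r"^(?P<base>.+?)v(?P<version>\d+)$")


def split_version(name):
    """Split "mySchemav1" into ("mySchema", 1); unversioned names get None."""
    match = _VERSIONED.match(name)
    if match is None:
        return name, None
    return match.group("base"), int(match.group("version"))


def latest_by_base(names):
    """For each base name, pick the schema name with the highest version."""
    latest = {}
    for name in names:
        base, version = split_version(name)
        if version is None:
            continue  # unversioned names are left out of the comparison
        if base not in latest or version > latest[base][0]:
            latest[base] = (version, name)
    return {base: name for base, (_, name) in latest.items()}
```

Versions are compared as integers, so mySchemav10 ranks above mySchemav2, which a plain string sort would get wrong.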

Cheers

On 6 Jul 2017 1:59 AM, "Joe Witt" <jo...@gmail.com> wrote:

I think it does make sense, and someone at a meetup asked a similar
question.  There are some things to be considered, like how does one
annotate the version of a schema, the name, etc. when all they are
providing are files in a directory?  How can they support multiple versions
of a given schema (or maybe they just don't in this approach)?  But there is
no question that being able to just push an avsc file into a directory and
then have it be usable in the flow could be helpful.


Re: Filesystem based Schema Registry, does it make sense?

Posted by Joe Witt <jo...@gmail.com>.
I think it does make sense, and someone at a meetup asked a similar
question.  There are some things to be considered, like how does one
annotate the version of a schema, the name, etc. when all they are
providing are files in a directory?  How can they support multiple versions
of a given schema (or maybe they just don't in this approach)?  But there is
no question that being able to just push an avsc file into a directory and
then have it be usable in the flow could be helpful.
