Posted to dev@nifi.apache.org by dcave <dc...@ssglimited.com> on 2015/12/02 18:39:59 UTC

Re: Common data exchange formats and tabular data

Adding multiple input and output format support would complicate the
usability and ongoing maintenance of the SQL/NoSQL processors.
Additionally, as you suggested, it is impossible to select a "correct" format
or set of formats that can handle all potential needs.

A simpler and more streamlined solution is to put the emphasis on having
Convert processors available that can handle specific cases as they come up,
as your last comment suggested.  This also keeps each processor focused on
one specific task rather than having Get/Put/Convert hybrids that can lead to
unneeded complexity and code bloat.
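
To make the separation concrete, here is a minimal sketch of the idea in
Python rather than in NiFi's own Java API.  Every name in it (get_rows,
convert_rows_to_json, convert_json_to_csv, put_bytes) is hypothetical and
only illustrates the shape of the design: the Get and Put steps each know
a single representation, and all format knowledge sits in small Convert
steps that can be added as new cases come up.

```python
import csv
import io
import json
from typing import Dict, List

Row = Dict[str, str]


def get_rows() -> List[Row]:
    """Stand-in for a Get step; the rows are made-up sample data."""
    return [{"id": "1", "name": "Ada"}, {"id": "2", "name": "Grace"}]


def convert_rows_to_json(rows: List[Row]) -> bytes:
    """Convert step: in-memory rows to JSON bytes."""
    return json.dumps(rows, sort_keys=True).encode("utf-8")


def convert_json_to_csv(payload: bytes) -> bytes:
    """Convert step: JSON bytes (a list of flat objects) to CSV bytes."""
    rows = json.loads(payload.decode("utf-8"))
    if not rows:
        return b""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=sorted(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue().encode("utf-8")


def put_bytes(payload: bytes, sink: List[bytes]) -> None:
    """Stand-in for a Put step; it stores bytes and knows nothing about formats."""
    sink.append(payload)


# Compose the single-purpose steps instead of building one Get/Put/Convert hybrid.
sink: List[bytes] = []
put_bytes(convert_json_to_csv(convert_rows_to_json(get_rows())), sink)
print(sink[0].decode("utf-8"))
```

Supporting another target format in this sketch means adding one more
convert function; get_rows and put_bytes stay unchanged.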

This is in line with Benjamin's line of work.



--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/Common-data-exchange-formats-and-tabular-data-tp3508p5551.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: Common data exchange formats and tabular data

Posted by Joe Witt <jo...@gmail.com>.
Toivo,

At a framework level NiFi itself is format/schema agnostic.  It holds
maps of strings (attributes) and a chunk of zero or more bytes
(payload) on a per flowfile basis.  Those bytes could be anything.  The
majority of processors exist to deal with protocols of exchange between
systems and are generally also type agnostic.  Others work with JSON,
XML, Avro, CSV, text files, etc.  These processors inherently are
meant to deal with those formats.
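
As a rough illustration of that model (a conceptual sketch in Python, not
NiFi's actual Java API), a flowfile can be pictured as a map of string
attributes plus an opaque byte payload.  The class FlowFileSketch, the
function route_by_attribute, and the attribute key used below are
assumptions made only for this example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class FlowFileSketch:
    """Hypothetical stand-in for a flowfile: string attributes plus opaque bytes."""
    attributes: Dict[str, str] = field(default_factory=dict)
    payload: bytes = b""  # zero or more bytes; nothing here interprets them


def route_by_attribute(flowfile: FlowFileSketch, key: str,
                       default: str = "unmatched") -> str:
    """A format-agnostic step: it reads attributes only and never parses the payload."""
    return flowfile.attributes.get(key, default)


# The attribute key below is illustrative only.
csv_file = FlowFileSketch({"mime.type": "text/csv"}, b"id,name\n1,Ada\n")
empty_file = FlowFileSketch()  # an empty payload is equally valid

print(route_by_attribute(csv_file, "mime.type"))
print(route_by_attribute(empty_file, "mime.type"))
```

Only a step that claims to understand a specific format would ever need
to look inside payload; everything else can pass the bytes through.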

Now, with the inclusion of processors that pull from and send to
databases we've hit the need for a way to serialize that data while it
lives in NiFi.  We also need this for things like writing to Accumulo
or HBase and presumably other systems as well.  These are systems for
which the 'structure' of the data is in many ways controlled by their
model (mutations, rows, etc.).  So, for these I definitely see what
you mean about us centering on some recommended and fully tooled
formats.  The devil of course is always in the schema details.

When pulling from Solr, or a database, or pushing to Accumulo or HBase
or other such systems I do think we can/should find a standard.

Joe

On Sat, Jan 2, 2016 at 5:03 AM, Toivo Adams <to...@gmail.com> wrote:
> I was occupied with other things lately and didn't have time to deal with
> this.
>
> In my opinion, a clear statement of which formats NiFi recommends would
> help newcomers and processor writers a lot. It would also help create
> Lego-like pieces that fit together without incompatibility worries.
>
> In an ideal world I'd like to have one or two “correct data exchange formats”.
> But reality is different.
> Different users have different needs and habits, so there is a need for many
> formats.
>
> I would prefer to have a few 'recommended core formats' which processors
> should support in one way or another, and Convert processors for all
> other formats.
>
> But I am not sure how to reach consensus.
>
>
> Thanks
> Toivo

Re: Common data exchange formats and tabular data

Posted by Toivo Adams <to...@gmail.com>.
I was occupied with other things lately and didn't have time to deal with
this.

In my opinion, a clear statement of which formats NiFi recommends would
help newcomers and processor writers a lot. It would also help create
Lego-like pieces that fit together without incompatibility worries.

In an ideal world I'd like to have one or two “correct data exchange formats”.
But reality is different.
Different users have different needs and habits, so there is a need for many
formats.

I would prefer to have a few 'recommended core formats' which processors
should support in one way or another, and Convert processors for all
other formats.

But I am not sure how to reach consensus.


Thanks
Toivo




--
View this message in context: http://apache-nifi-developer-list.39713.n7.nabble.com/Common-data-exchange-formats-and-tabular-data-tp3508p6009.html
Sent from the Apache NiFi Developer List mailing list archive at Nabble.com.

Re: Common data exchange formats and tabular data

Posted by Joe Witt <jo...@gmail.com>.
Toivo - this thread seems important and does not appear to have come
to a resolution.  Do you want to pick this back up or are you
comfortable with where it is for now?

On Wed, Dec 2, 2015 at 12:39 PM, dcave <dc...@ssglimited.com> wrote:
> Adding multiple input and output format support would complicate the
> usability and ongoing maintenance of the SQL/NoSQL processors.
> Additionally, as you suggested, it is impossible to select a "correct" format
> or set of formats that can handle all potential needs.
>
> A simpler and more streamlined solution is to put the emphasis on having
> Convert processors available that can handle specific cases as they come up,
> as your last comment suggested.  This also keeps each processor focused on
> one specific task rather than having Get/Put/Convert hybrids that can lead to
> unneeded complexity and code bloat.
>
> This is in line with Benjamin's line of work.