Posted to dev@avro.apache.org by Dhasharath Shrivathsa <dh...@radix.bio> on 2019/07/28 03:54:21 UTC

Re: Radical idea for 1.9.0

> Perhaps I don’t understand you. What is “a recursive record”? AFAIU real
> data can not be recursive. (I would love to be shown to be wrong about
> this.)

Real data is very recursive. Consider the canonical definitions of
List/Tree, and anything built with cons/cdr:
https://en.wikipedia.org/wiki/Cons

JSON is a recursive record: at any point, a JSON object contains a map from
a string to another JSON value. The same recursion is there for most
AST-like descriptions.
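
To make "recursive" concrete, here is a minimal Python sketch of both shapes
mentioned above: a cons list whose tail is another cons list, and a JSON-like
value whose maps contain further JSON-like values. The names Cons, length and
depth are made up for illustration and are not part of Avro or any library.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cons:
    """A cons cell: one value plus a reference to the rest of the list."""
    head: int
    tail: Optional["Cons"]  # None marks the end of the list


def length(cell: Optional[Cons]) -> int:
    """Count the cells by following tail pointers until the end."""
    n = 0
    while cell is not None:
        n += 1
        cell = cell.tail
    return n


def depth(value) -> int:
    """Nesting depth of a JSON-like value; scalars have depth 0."""
    if isinstance(value, dict):
        return 1 + max((depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((depth(v) for v in value), default=0)
    return 0


numbers = Cons(1, Cons(2, Cons(3, None)))
doc = {"a": {"b": {"c": [1, 2]}}}
```

Nothing in either definition bounds how deep the data can go, which is the
sense in which the type refers to itself.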

Denormalizing this data means inserting parent/child pointers and emitting
multiple messages, but this kinda sucks, since you'd end up writing
something like a WITH RECURSIVE SQL query to unmelt the data back into its
recursive form.
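
A rough Python sketch of that round trip, assuming a toy tree of
{"label", "children"} dicts (my own made-up shape, not an Avro type): flatten
emits one flat row per node with a parent pointer, and rebuild does the
reassembly work that a WITH RECURSIVE query would otherwise do.

```python
from itertools import count


def flatten(tree, parent_id=None, ids=None):
    """Yield one flat row per node: (id, parent_id, label), in preorder."""
    ids = ids if ids is not None else count()
    node_id = next(ids)
    yield (node_id, parent_id, tree["label"])
    for child in tree["children"]:
        yield from flatten(child, node_id, ids)


def rebuild(rows):
    """Reassemble the tree from flat rows by following the parent pointers."""
    nodes = {node_id: {"label": label, "children": []} for node_id, _, label in rows}
    root = None
    for node_id, parent_id, _ in rows:
        if parent_id is None:
            root = nodes[node_id]
        else:
            nodes[parent_id]["children"].append(nodes[node_id])
    return root


tree = {"label": "root", "children": [
    {"label": "a", "children": [{"label": "a1", "children": []}]},
    {"label": "b", "children": []},
]}
rows = list(flatten(tree))
```

Every consumer has to carry something like rebuild around, which is the cost
of giving up the recursive type.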

Incidentally, with the stuff in AVRO-530, you can write a transform that
takes a true recursive type and turns it into a sorta-recursive Avro type.
See my issue here: https://github.com/sksamuel/avro4s/issues/307, which has
since been generalized to arbitrary fixpoint types.

Instead of straight indexing, the thing to use for indexing below the top
level would be an F-algebra/visitor pattern, and likewise for binary
serialization. Most data that's not recursive would simply apply the
F-algebra once, but in the case of recursion you'd apply it multiple times
to generate the binary serialization, and annotate with something like a
coelgot algebra to give you a top-level schema.
Between AVRO-530 + AVRO-248, the sketch of what to do is already there.
I'm confident I could do this in Haskell/Scala, but I don't know Java so
can't contribute to Avro.
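
For anyone who hasn't met the F-algebra idea, here is a minimal Python sketch
of just the fold part (a catamorphism). Everything in it is an assumption for
illustration: NodeF, Fix, cata and the toy encode algebra are made-up names,
and the byte layout is NOT Avro's binary encoding. The coelgot/schema side is
not shown.

```python
from dataclasses import dataclass
from typing import Generic, List, TypeVar

A = TypeVar("A")


@dataclass
class NodeF(Generic[A]):
    """One non-recursive layer of a tree: a label plus children of type A."""
    label: str
    children: List[A]


def fmap(f, layer):
    """Apply f to every child of a layer, leaving the label alone."""
    return NodeF(layer.label, [f(child) for child in layer.children])


@dataclass
class Fix:
    """Ties the recursive knot: a tree is a NodeF whose children are trees."""
    unfix: "NodeF[Fix]"


def cata(algebra, tree):
    """Fold bottom-up: fold the children first, then apply the algebra to the layer."""
    return algebra(fmap(lambda child: cata(algebra, child), tree.unfix))


def encode(layer: NodeF[bytes]) -> bytes:
    """Toy algebra (hypothetical format): length-prefixed label, child count,
    then the already-encoded children."""
    label = layer.label.encode("utf-8")
    return bytes([len(label)]) + label + bytes([len(layer.children)]) + b"".join(layer.children)


def size(layer: NodeF[int]) -> int:
    """A second algebra over the same shape: count the nodes."""
    return 1 + sum(layer.children)


leaf = Fix(NodeF("b", []))
tree = Fix(NodeF("a", [leaf, Fix(NodeF("c", []))]))
encoded_leaf = cata(encode, leaf)  # the algebra is applied once
encoded_tree = cata(encode, tree)  # the algebra is applied once per layer
```

The algebra itself never recurses; cata owns the recursion, so the same
traversal serves both serialization and indexing-style queries like size.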





