You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@thrift.apache.org by Jeff Brown <je...@gmail.com> on 2010/03/02 11:58:05 UTC

Trees in Thrift.

So...
I've been trolling the archives and it seems that supporting recursive data
structures in Thrift is a popular and contentious topic.

Let me see if I understand the main arguments

Pros:
- Recursive data structures (trees) are extremely common and natural.
- This is super easy to implement in just about any language that has
pointers.

Cons:
- The C++ API would need to change to handle recursive types.
- It is possible to emulate recursive data structures other ways.

Red Herrings:
- Forward references make things hard.  (But we don't need these for trees.
Sucks if you want to model a graph though.)
- We would need smart pointers for reference counting, etc.  (No we don't,
object ownership is trivial in a tree.  The parent owns its children.
Standard pointers can be used.  Just make sure that all copies are deep
copies.)

So to be fair, all I want are trees.  I don't need graphs.  I don't want to
think or talk about graphs in general and the interesting problems they
present.  Let's focus on trees.

Ok, so for all language bindings that already use aggregation to represent
the nested components of a Thrift struct, I strongly doubt any significant
changes will be required.

Unfortunately for language bindings that use composition instead of
aggregation (C++ only?), recursive data structures won't work unless we
change a few things.

a. One way to think about this dilemma is that there could be two C++
language bindings.  The "classic" version would use composed structs only
and the "new" version would use aggregated structs (maybe only when
necessary).  The user would choose to generate code according to one pattern
or the other.  (yuck)

b. Another way to think about it is that perhaps the IDL author should be in
control of when pointers are used.  To that end, we would add a new type to
Thrift like "ref<T>" that would either translate into a smart pointer with
strict ownership (like auto_ptr<T>) or into an ordinary C++ pointer and some
appropriate logic in the struct's destructor and copy-constructor.

c. And another way is to notice that the Thrift "list<T>" type generates
std::vector<T> which already internally holds a pointer.  So without any IDL
changes at all, we could just lift the restriction on recursive data
structures to say that self-referencing types can only be used in Thrift
list<T>, set<T> and map<K, V> fields.  This should all "just work" in all
current language bindings.

Bottom line, I think the restriction on recursive data structures can be
lifted, it's just a question of implementation tradeoffs.

My vote would be for 3c.  Just permit recursive data structures involving
list<T>, set<T> and map<K, V> fields.  We can argue about 3b another day.
:-)

Jeff.