Posted to user@couchdb.apache.org by Andreas Rieger <an...@gmail.com> on 2010/11/15 23:08:50 UTC

modelling and presentation of related data (using flask-couchdb)

Hi all,

I am currently working on my first CouchDB project and could use some
advice on modelling and presenting my data.
I am using Flask with Flask-CouchDB, and what I am struggling with
right now is displaying related data in a list view/template.

Let's say I have companies, their branches in different cities, and
employees working in these branches.
I split my data, as advised in the wiki
(http://wiki.apache.org/couchdb/EntityRelationship), into separate
documents (1:n), so I have to join them later with a view.

Company
-------
_id
company_id 1
name Company-A

Branch
------
_id
branch_id 1
company 1
name Branch-A

Employee
--------
_id
branch 1
name

Although I assume using the _ids to relate the documents would be
cleaner, this is basically a foreign-key-like setup.

Or would it fit CouchDB's overall approach better at this point to
set up the references from the other end?

Like so ...

Branch
------
_id
name
employees = [1,2,3]

The first scheme seems better to me, as I can create and relate new
employees without needing to edit the related "master" document (the
branch in this case) for each new employee.
On the other hand, techniques like "linked documents"
(http://wiki.apache.org/couchdb/Introduction_to_CouchDB_views#Keys_and_values)
seem to presume the second way of modelling the data ...
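
To make the trade-off between the two schemes concrete, here is a
minimal sketch that uses plain Python dicts in place of CouchDB
documents. The in-memory `db` dict, the helper names and the ids are
made up for illustration; only the field names follow the schemas
above.

```python
# In-memory stand-in for a database: maps _id -> document.
db = {
    "branch-1": {"_id": "branch-1", "doc_type": "branch",
                 "name": "Branch-A", "employees": []},
}

def add_employee_child_ref(db, emp_id, name, branch_id):
    """Scheme 1: the employee points at its branch.

    Only the new employee document is written.
    """
    db[emp_id] = {"_id": emp_id, "doc_type": "employee",
                  "name": name, "branch": branch_id}
    return [emp_id]  # ids of the documents that were written

def add_employee_parent_list(db, emp_id, name, branch_id):
    """Scheme 2: the branch lists its employees.

    The employee is written and the branch document is updated as well.
    """
    db[emp_id] = {"_id": emp_id, "doc_type": "employee", "name": name}
    db[branch_id]["employees"].append(emp_id)
    return [emp_id, branch_id]

written_1 = add_employee_child_ref(db, "emp-1", "Alice", "branch-1")
written_2 = add_employee_parent_list(db, "emp-2", "Bob", "branch-1")
```

In a real database the second write in scheme 2 also needs the
branch's current _rev and can conflict when two employees are added
concurrently, which is part of why scheme 1 feels lighter.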


Okay ... with the advice from the "Definitive Guide" concerning
virtual documents and this blog post
(http://www.cmlenz.net/archives/2007/10/couchdb-joins) I managed to
show all my related data (using the first scheme described above) in
a single view.

Company1
BranchA
BranchB
Company2
BranchC
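
The row order above comes from emitting compound keys, roughly as in
the blog post. The following is only a sketch of that idea in pure
Python, not the actual view: the map function is a Python rendition of
what would normally be a JavaScript map function, the sample documents
are invented, and `sorted` merely approximates CouchDB's key collation
for small integer arrays.

```python
def map_company_branches(doc):
    """Yield (key, value) pairs the way a map function would emit them."""
    if doc["doc_type"] == "company":
        # 0 sorts the company row before its branches.
        yield [doc["company_id"], 0], doc["name"]
    elif doc["doc_type"] == "branch":
        # Same leading key element as the owning company, then 1.
        yield [doc["company"], 1], doc["name"]

# Invented sample documents, deliberately in mixed order.
docs = [
    {"doc_type": "branch", "branch_id": 1, "company": 1, "name": "BranchA"},
    {"doc_type": "company", "company_id": 2, "name": "Company2"},
    {"doc_type": "branch", "branch_id": 3, "company": 2, "name": "BranchC"},
    {"doc_type": "company", "company_id": 1, "name": "Company1"},
    {"doc_type": "branch", "branch_id": 2, "company": 1, "name": "BranchB"},
]

# Sorting by key stands in for the ordering of the view index.
rows = sorted(
    (pair for doc in docs for pair in map_company_branches(doc)),
    key=lambda pair: pair[0],
)
names = [value for _key, value in rows]
```

Each company row is followed directly by its branch rows because they
share the first key element.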

If I iterate over the data in a template I can certainly display
companies and branches together, but as I get the data
as flat rows I can't count the branches or attach additional
info to a single row.

As the wiki says:
"This is a little bit like a JOIN in SQL although in SQL the data
fields would be joined together on a row where here they are on
consecutive rows.
This latter approach allows a variable number of data fields which is
more flexible than SQL."

But as I mentioned, what if I want to display extra information per
row in a template? Let's say the total count of branches? Or a list of
all employees?
I don't see how I can, for example, create a sublist of branches for a
company in a template using the view result. At the very least I can't
hide the branch sublist for companies with no branches, as a row
doesn't know whether further rows representing related branches follow.
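
One possible way around this, sketched here under assumptions rather
than taken from any library, is to regroup the flat rows in Python
before they reach the template. The rows below are plain (key, value)
tuples that mimic a join view with keys of the form [company_id, 0]
for companies and [company_id, 1] for branches; `nest_companies` is a
hypothetical helper.

```python
from itertools import groupby

# Rows as (key, value) pairs, already sorted the way a view returns them.
rows = [
    ([1, 0], "Company1"),
    ([1, 1], "BranchA"),
    ([1, 1], "BranchB"),
    ([2, 0], "Company2"),
    ([2, 1], "BranchC"),
    ([3, 0], "Company3"),  # a company without branches
]

def nest_companies(rows):
    """Turn consecutive view rows into one dict per company."""
    companies = []
    # groupby relies on rows with the same company id being adjacent.
    for company_id, group in groupby(rows, key=lambda row: row[0][0]):
        company = {"id": company_id, "name": None, "branches": []}
        for key, value in group:
            if key[1] == 0:
                company["name"] = value
            else:
                company["branches"].append(value)
        companies.append(company)
    return companies

companies = nest_companies(rows)
```

With this structure a Jinja template can test `company.branches`
before rendering the sublist and show `company.branches|length` as the
branch count.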

Having some experience with Django, I basically miss the option to
create model methods that return related data for a given instance,
which I could call per row.
Or am I missing or misunderstanding something fundamental here?
I understand that the idea of side-effect-free views somewhat conflicts
with my approach, but then again I wonder if there's a pragmatic way to
achieve my goal, as I assume it is not too exotic.


What I finally realized is that something Django-like is possible
using the Document class provided by Flask-CouchDB.

class Company(Document):

    doc_type = TextField(default="company")
    name = TextField()
    company_id = TextField()

    def get_locations(self):
        """ query and return associated locations """
        return list_locations[self.company_id]

(list_locations is a view which returns all locations ordered by _id.)

Okay, with this it's trivial to display associated data for a Document,
at least in a detail view of Company, where I use python-couchdb's
document mapping (Company.load(id)).

But when it comes to a list view which iterates over all Company
documents, it's probably not very elegant to call a subquery for every
item, especially as I have to pipe the CouchDB view results (rows)
through the model mapping to gain this option
([Company.load(c.id) for c in list_companies()],
where list_companies() is a view that returns all companies by id).

This is probably a killer with regard to performance. But then again
it's the most practical solution I could find.
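
For comparison, a sketch of how the per-item subquery might be avoided:
fetch all locations once, index them by company id in Python, and hand
each company its list. Everything here is a stand-in: the `Company`
class replaces the mapped Document, the rows are simplified dicts with
"key" and "value", and it assumes the locations view emits the company
id as its key.

```python
from collections import defaultdict

class Company:
    """Minimal stand-in for the mapped Company document."""
    def __init__(self, company_id, name):
        self.company_id = company_id
        self.name = name
        self.locations = []

def attach_locations(companies, location_rows):
    """Group location rows by company id and attach them in one pass."""
    by_company = defaultdict(list)
    for row in location_rows:
        by_company[row["key"]].append(row["value"])
    for company in companies:
        # Companies without locations simply get an empty list.
        company.locations = by_company.get(company.company_id, [])
    return companies

# Invented sample data: one result set for companies, one for locations.
companies = [Company("1", "Company-A"), Company("2", "Company-B")]
location_rows = [
    {"key": "1", "value": "Branch-A"},
    {"key": "1", "value": "Branch-B"},
]

companies = attach_locations(companies, location_rows)
```

This needs two view queries in total, however many companies are
listed, instead of one extra query per company.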

So ... as I am sure that there's a better and cleaner way to do this, I
am looking forward to some suggestions.

Greetings,
Andreas