Posted to commits@couchdb.apache.org by Apache Wiki <wi...@apache.org> on 2009/04/12 11:44:18 UTC

[Couchdb Wiki] Update of "EntityRelationship" by WoutMertens

The following page has been changed by WoutMertens:
http://wiki.apache.org/couchdb/EntityRelationship

New page:
= Modeling Entity Relationships in CouchDB =

This page is mostly a translation of http://code.google.com/appengine/articles/modeling.html in CouchDB terms.

== Why would I need entity relationships? ==
Imagine you are building a snazzy new web application that includes an address book where users can store their contacts. For each contact the user stores, you want to capture the contact's name, birthday (which they mustn't forget!), address, telephone number, and the company they work for.
When the user wants to add a contact, they enter the information into a form, and the form saves it in a model that looks something like this:
{{{
{
  "_id":"some unique string that is assigned to the contact",
  "type":"contact",
  "name":"contact's name",
  "birth_day":"a date in string form",
  "address":"the address in string form (like 1600 Amphitheatre Pkwy., Mountain View, CA)",
  "phone_number":"phone number in string form",
  "company_title":"company title",
  "company_name":"name of the company",
  "company_description":"some explanation about the company",
  "company_address":"the company address in string form"
}
}}}
(Note that ''type'' means nothing to CouchDB; we use it here only for our own convenience. ''_id'' is the only field here that CouchDB itself interprets.)

That's great: your users immediately begin to use their address book, and soon the datastore starts to fill up. Not long after deploying your new application, you hear from someone who is unhappy that there is only one phone number. What if they want to store someone's work telephone number in addition to their home number? No problem, you think: you can just add a work phone number to your structure. You change your data structure to look more like this:
{{{
  "phone_number":"home phone in string form",
  "work_phone_number":"work phone in string form",
}}}
Update the form with the new field and you are back in business. Soon after redeploying your application, you get a number of new complaints. When they see the new phone number field, people start asking for even more fields. Some people want a fax number field, others want a mobile field. Some people even want more than one mobile field (boy, modern life sure is hectic)! You could add another field for fax, and another for mobile, maybe two. What if people have three mobile phones? What if they have ten? What if someone invents a phone for a place you've never thought of?
Your model needs to use relationships.

== One to Many ==
The answer is to allow users to assign as many phone numbers to each of their contacts as they like.

In CouchDB, there are two ways to achieve this:
 1. Use separate documents
 2. Use an embedded array

== Separate One to Many ==

When using separate documents, you could have documents like this for the phone numbers:
{{{
{
  "_id":"the phone number",
  "type":"phone",
  "contact":"_id of the contact document that has this phone number",
  "phone_type":"string describing type of phone, like home,work,fax,mobile,..."
}
}}}
(Note the use of the ''_id'' field to store the phone number. Phone numbers are unique when prefixed with country and area code, so they make a good ''natural key''.)

The key to making all this work is the ''contact'' property. By storing the contact's ''_id'' in it, you can refer to the owning contact unambiguously, since ''_id'' values are unique within a CouchDB database.
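
To read the phone numbers back by contact, CouchDB would typically use a view. The following is only a sketch and not part of the original page: the design document name `_design/phones`, the view name `by_contact`, the sample documents and the helper `phones_for` are all made up for illustration. The Python function merely imitates in memory what the JavaScript map function would emit; it does not talk to a CouchDB server.

```python
import json

# Hypothetical design document. CouchDB stores view map functions as
# JavaScript source strings under the "views" key of a design document.
design_doc = {
    "_id": "_design/phones",
    "views": {
        "by_contact": {
            "map": "function(doc) {"
                   " if (doc.type == 'phone') {"
                   " emit(doc.contact, {number: doc._id, phone_type: doc.phone_type});"
                   " } }"
        }
    },
}

# Sample documents (illustrative values only).
docs = [
    {"_id": "scott", "type": "contact", "name": "Scott"},
    {"_id": "+16505552200", "type": "phone", "contact": "scott", "phone_type": "home"},
    {"_id": "+16505552201", "type": "phone", "contact": "scott", "phone_type": "mobile"},
]

def phones_for(contact_id, documents):
    """In-memory stand-in for querying the view with key=contact_id."""
    rows = []
    for doc in documents:
        if doc.get("type") == "phone" and doc.get("contact") == contact_id:
            rows.append({"number": doc["_id"], "phone_type": doc["phone_type"]})
    # View rows that share a key come back ordered by document id.
    return sorted(rows, key=lambda row: row["number"])

print(json.dumps(design_doc, indent=2))
print(phones_for("scott", docs))
```

Emitting the ''contact'' value as the view key is what lets a single lookup return every phone document that points at one contact.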
= This is as far as I got, work in progress =
Creating the relationship between a contact and one of its phone numbers is easy to do. Let's say you have a contact named "Scott" who has a home phone and a mobile phone. You populate his contact info like this:
{{{
scott = Contact(name='Scott')
scott.put()
PhoneNumber(contact=scott,
            phone_type='home',
            number='(650) 555 - 2200').put()
PhoneNumber(contact=scott,
            phone_type='mobile',
            number='(650) 555 - 2201').put()
}}}
In the App Engine original, ReferenceProperty creates a ''phone_numbers'' collection property on Contact, which makes it very easy to retrieve all the phone numbers for a given person. To print all the phone numbers for a given person, you can do this:
{{{
print 'Content-Type: text/html'
print
for phone in scott.phone_numbers:
  print '%s: %s' % (phone.phone_type, phone.number)
}}}
This will produce results that look like:
{{{
home: (650) 555 - 2200
mobile: (650) 555 - 2201
}}}
Note: The order of the output might differ, because by default this kind of relationship has no ordering.
The phone_numbers virtual attribute is a Query instance, meaning that you can use it to further narrow down and sort the collection associated with the Contact. For example, if you only want to get the home phone numbers, you can do this:
{{{
scott.phone_numbers.filter('phone_type =', 'home')
}}}
When Scott loses his phone, it's easy enough to delete that record. Just delete the PhoneNumber instance and it can no longer be queried for:
{{{
scott.phone_numbers.filter('phone_type =', 'home').get().delete()
}}}
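
The same walkthrough might look roughly as follows in CouchDB terms. This is a sketch under stated assumptions, not a finished translation: the dict `db` stands in for a CouchDB database, `put` and `phone_numbers` are hypothetical helpers, and the phone numbers are written in an assumed normalised form with country and area code so that they can serve as ''_id'' values. A real application would go through CouchDB's HTTP API instead.

```python
# In-memory stand-in for a CouchDB database: a dict keyed by document _id.
db = {}

def put(doc):
    """Store a document under its _id (CouchDB would do this via HTTP PUT)."""
    db[doc["_id"]] = doc

def phone_numbers(contact_id, phone_type=None):
    """Return the phone documents pointing at contact_id, sorted by _id."""
    phones = [
        d for d in db.values()
        if d.get("type") == "phone"
        and d.get("contact") == contact_id
        and (phone_type is None or d.get("phone_type") == phone_type)
    ]
    return sorted(phones, key=lambda d: d["_id"])

# Create Scott and his two phone numbers as separate documents.
put({"_id": "scott", "type": "contact", "name": "Scott"})
put({"_id": "+16505552200", "type": "phone", "contact": "scott", "phone_type": "home"})
put({"_id": "+16505552201", "type": "phone", "contact": "scott", "phone_type": "mobile"})

# List all of Scott's numbers.
for phone in phone_numbers("scott"):
    print("%s: %s" % (phone["phone_type"], phone["_id"]))

# Scott loses his home phone: delete that one document.
for phone in phone_numbers("scott", phone_type="home"):
    del db[phone["_id"]]
```

Because each phone number is its own document, deleting one never touches the contact document itself.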

== Embedded One to Many ==

The embedded array is only an option as long as you don't have "too many" items to store: a document is always read, written and transferred as a whole, so bigger documents mean slower handling and network transfers. Phone numbers should be OK unless you plan to store the whole company phonebook in there.
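
As a rough illustration of the embedded approach, a contact document could carry its phone numbers in an array. The field name `phones`, the keys inside each entry and the helpers `add_phone` and `numbers_of_type` are assumptions made for this sketch; the page does not prescribe them.

```python
import json

# Contact document with phone numbers embedded as an array.
# The field name "phones" and the keys of each entry are illustrative.
contact = {
    "_id": "scott",
    "type": "contact",
    "name": "Scott",
    "phones": [
        {"phone_type": "home", "number": "+16505552200"},
        {"phone_type": "mobile", "number": "+16505552201"},
    ],
}

def add_phone(doc, phone_type, number):
    """Append a phone entry; the whole document must then be saved again."""
    doc.setdefault("phones", []).append({"phone_type": phone_type, "number": number})

def numbers_of_type(doc, phone_type):
    """Return all numbers of one type, in array order."""
    return [p["number"] for p in doc.get("phones", []) if p["phone_type"] == phone_type]

add_phone(contact, "fax", "+16505552202")
print(json.dumps(contact, indent=2))
print(numbers_of_type(contact, "mobile"))
```

Every added or removed number rewrites the whole contact document, which is why this only suits small collections.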