Posted to user@couchdb.apache.org by Morbus Iff <mo...@disobey.com> on 2009/07/09 20:09:40 UTC

Defining my document model when the source is entity-relationship

Hello!

I know nothing about CouchDB (woohoo!)

You can all blame nslater for this mail (boo! hiss!)

I don't really have a huge interest in learning CouchDB - it's more of a 
passing "huh", based solely cos nslater keeps talking about the damn 
thing on IRC all the time. But, I'd figure that if I'm going to rib him 
about working on some crazy new technology, I might as well base my 
flaming and puerile hatred on actual facts and usage, yeah? ;)

So, to satisfy said passing "huh", my pet project will be to implement 
FRBR within CouchDB. FRBR is a librarian tech which basically models a 
way to talk about works of creation. It's relatively new in the scheme 
of things (the librarian world moves a lot slower than the internet 
world). The design of FRBR, however, is mostly based around the idea of 
relational databases, which is exactly what CouchDB purportedly isn't.

Right.

I've already, four or five years ago, taken the FRBR spec and converted 
it into a set of MySQL relational tables. The first thing I can do 
with CouchDB, however, is think about how FRBR fits into the 
"Self-Contained Data" model of /relax/why-couchdb.

To quote from WP:Functional_Requirements_for_Bibliographic_Records:

   Group 1 entities are Work, Expression, Manifestation, and Item (WEMI).
   They represent the products of intellectual or artistic endeavour.

   Group 2 entities are person and corporate body, responsible for
   the custodianship of Group 1’s intellectual or artistic endeavour.

   Group 3 entities are subjects of Group 1 or Group 2’s intellectual
   endeavour, and include concepts, objects, events, places.

There are some swank charts on WP showing this model.

http://en.wikipedia.org/wiki/File:FRBR-Group-1-entities-and-basic-relations.svg
http://en.wikipedia.org/wiki/File:FRBR-Group-2-entities-and-relations.svg

The simplest question, really, is:

  * Should all these Group entities be individual docs...
  * ... or should they all be a single document inside CouchDB?

Perhaps I should start out with an example of what FRBR is and isn't.

You own a book called "Morbus Rules". It's signed by Morbus, but your 
dog took a piss on it, so the bottom half is slightly stained. At first, 
you'd say to yourself, well, "hey! that's a document! why, it's just 
like the business card or address book analogy we love to use!"

Right. It is.

But, in FRBR, that simple book is a lot more complex. That simple book 
is an "Item" (your personal copy) of a "Manifestation" (all other books 
that are the same printing from the same publisher) of an "Expression" 
(all versions of this book that share the exact same creative parts) of 
a "Work" (the theoretical hand-waving artistic/creative "feeling", which 
could be expressed as a book, a musical, an interactive DVD, etc.). It 
has various "People" and "Companies" involved (that could change from 
Work to Expression to Item - i.e., you are the Person:Owner of this 
Item, but the Person:Author is always the same for any of these 
Expressions). Concepts, locations, and other tag-like thingies also 
apply to this Manifestation (and potentially, to the Item itself, like 
"dog pissed on it" or the more polite "used").

/me coughs. You, in the back, wake up!
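
For anyone still dozing, the chain above can be pinned down as a rough 
Python sketch, with plain dicts standing in for JSON. Every id and field 
name here ("parent", "roles", "tags", the example publisher) is made up 
for illustration; none of it comes from the FRBR spec or from CouchDB, 
and hanging the Author off the Work is just my assumption.

```python
# Hypothetical records: ids and field names are illustrative only.
RECORDS = {
    "work:morbus-rules": {
        "type": "work", "parent": None,
        "roles": {"author": "person:morbus"},  # assumed to live on the Work
    },
    "expr:morbus-rules-text": {
        "type": "expression", "parent": "work:morbus-rules", "roles": {},
    },
    "manif:morbus-rules-1st": {
        "type": "manifestation", "parent": "expr:morbus-rules-text",
        "roles": {"publisher": "company:example-press"},
    },
    "item:your-copy": {
        "type": "item", "parent": "manif:morbus-rules-1st",
        "roles": {"owner": "person:you"},
        "tags": ["signed", "used"],
    },
}


def chain(record_id, records=RECORDS):
    """Follow parent links from a record up to its Work; ids come back bottom-up."""
    path = []
    while record_id is not None:
        path.append(record_id)
        record_id = records[record_id]["parent"]
    return path


def roles_for(record_id, records=RECORDS):
    """Collect roles along the chain; lower levels override higher ones."""
    merged = {}
    for rid in reversed(chain(record_id, records)):
        merged.update(records[rid]["roles"])
    return merged
```

Calling chain("item:your-copy") walks Item, Manifestation, Expression, 
Work in that order, and roles_for("item:your-copy") gathers the owner, 
publisher, and author from the levels where each one is attached.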

Should FRBR Group 1 entities (the combined mega-Thing of Work, 
Expression, Manifestation, and Item; "WEMI") be a single document within 
CouchDB? Or should they each be their own document which somehow relates 
to all the others?

Things like tags and identifiers (this book's ISSN, DOI, ISBN, UPC, etc.) 
I can easily see as being part of the Self-Contained Data of a document. 
But I'm not sure if there should only be one JSON document called 
"Work", with it containing all the other major pieces of a creative 
endeavor, or if it should be four major documents (W, E, M, I) with 
relations to each other.
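
To make the two options concrete, here is a minimal sketch in Python, 
again with dicts standing in for JSON. The field names ("expressions", 
"manifestations", "items", "parent"), the generated ids, and the 
placeholder values are all assumptions of mine, not anything FRBR or 
CouchDB prescribes. Option A is the single nested "Work" document; 
split() turns it into Option B, four documents that point at each other.

```python
LEVELS = ["work", "expression", "manifestation", "item"]
CHILD_KEYS = {
    "work": "expressions",
    "expression": "manifestations",
    "manifestation": "items",
}

# Option A: one self-contained document with everything nested inside it.
single_doc = {
    "_id": "work:morbus-rules",
    "title": "Morbus Rules",
    "expressions": [{
        "label": "original text",
        "manifestations": [{
            "publisher": "Example Press",           # placeholder value
            "identifiers": {"isbn": "(placeholder)"},
            "items": [{"owner": "you", "tags": ["signed", "used"]}],
        }],
    }],
}


def split(node, level=0, doc_id=None, parent_id=None):
    """Option B: flatten a nested Work into separate docs linked by 'parent'."""
    doc_type = LEVELS[level]
    child_key = CHILD_KEYS.get(doc_type)  # None once we reach an Item
    doc_id = doc_id or node["_id"]

    # Copy every field except the nested children.
    doc = {k: v for k, v in node.items() if k != child_key}
    doc.update({"_id": doc_id, "type": doc_type})
    if parent_id is not None:
        doc["parent"] = parent_id

    docs = [doc]
    children = node.get(child_key, []) if child_key else []
    for n, child in enumerate(children):
        child_id = f"{doc_id}/{LEVELS[level + 1]}-{n}"
        docs.extend(split(child, level + 1, child_id, doc_id))
    return docs
```

Running split(single_doc) yields one document per level, each carrying 
the id of the level above it. Option A gets you the whole thing in one 
read; Option B lets a single copy (the Item) change without touching 
the rest, at the cost of having to follow the links.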

I don't expect you to understand FRBR fully; I'm just trying to fit a 
design that was specifically made /for/ relational databases into 
something that was specifically made /not for/ the relational approach.

-- 
Morbus Iff ( tomorrow never comes until it's too late )
Technical: http://www.oreillynet.com/pub/au/779
Enjoy: http://www.disobey.com/ and http://www.videounderbelly.com/
aim: akaMorbus / skype: morbusiff / icq: 2927491 / jabber.org: morbus