Posted to user@hbase.apache.org by James Pettyjohn <ja...@scientology.net> on 2011/05/24 19:42:43 UTC

Modeling suggestions


I am planning out a central database for contact information, invoices
and a bunch of other domain-specific information that will be coming from
hundreds of geographically disparate locations. With the requirement that
every change ever made be kept forever, I wanted this in Hadoop/HBase
but am not sure of the best architecture for the problem.

The same person
will exist in maybe a dozen or more of these locations, so the view of
that person from each location must be visible, and a unified view will
also be needed.

Additionally, every change (inclusive of updates/deletes/inserts)
is to be recorded permanently. This would be needed for every record for
random read use: you could pull up an address history for the person and
you could also see the history of that person for every location we have
data from for him. Additionally, I would want to be able to take all the
changes from the beginning of records to some arbitrary year and 'play back'
the updates, or something to this effect, to have a database that was
exactly what we had at that time.
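
To make the playback idea concrete, here is a rough sketch in plain Python
(no Hadoop or HBase involved). The Change record, its field names and the
replay function are all made up for illustration. I am assuming each logged
change carries a timestamp, a record key and the kind of operation.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class Change:
    """One logged change. All field names here are hypothetical."""
    ts: int                      # time of the change, e.g. epoch milliseconds
    key: str                     # identifies the record, e.g. "person42/address"
    op: str                      # "insert", "update" or "delete"
    value: Optional[str] = None  # new value; unused for deletes


def replay(log: Iterable[Change], cutoff_ts: int) -> Dict[str, Optional[str]]:
    """Rebuild the state as of cutoff_ts by replaying the log in time order."""
    state: Dict[str, Optional[str]] = {}
    for change in sorted(log, key=lambda c: c.ts):
        if change.ts > cutoff_ts:
            break  # the log is sorted, so everything from here on is too late
        if change.op == "delete":
            state.pop(change.key, None)
        elif change.op in ("insert", "update"):
            state[change.key] = change.value
        else:
            raise ValueError(f"unknown op: {change.op!r}")
    return state
```

Calling replay with the full log and a cutoff at the end of some year would
give the records as they stood at that point. A real log would be far too
large to sort in memory, so this only shows the logic, not how it would run
over files in HDFS.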

Usual usage would be the primary contact
information and facts about him (e.g. he attended this event, got this
kind of training, etc.). Next to that you would have invoices, notes, his
data as it is from a specific location, and historical records both for the
unified view and based on the local data.

The major pieces I
had in mind are Flume-style storage of all updates into files in HDFS
for the playback use, and cooking the same updates into HBase at the same
time.

How it's stored in HBase is where I get a little murky. I had been planning
on a column family per major dataset, e.g. invoices, notes, local data
sets, etc. I know the docs advise against more than 2 families, so I
am not sure about this one.
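
One alternative I could imagine to a family per dataset is putting the
dataset name into the row key and keeping everything in a single family.
The sketch below is only an assumption about a possible layout, not
something taken from the docs: the key format, the '|' separator and the
row_key function are hypothetical. It is plain Python that just builds the
key bytes, relying on HBase sorting row keys byte by byte.

```python
import struct

MAX_TS = 2**63 - 1  # largest signed 64-bit value


def row_key(person_id: str, location: str, dataset: str, ts_millis: int) -> bytes:
    """Build a key of the form person|location|dataset|<reversed timestamp>."""
    for part in (person_id, location, dataset):
        if "|" in part:
            raise ValueError("separator '|' is not allowed inside a key part")
    if not 0 <= ts_millis <= MAX_TS:
        raise ValueError("timestamp out of range")
    prefix = "|".join((person_id, location, dataset)).encode("utf-8") + b"|"
    # Big-endian keeps byte order equal to numeric order; subtracting from
    # MAX_TS reverses it so the newest entry sorts first under a prefix.
    return prefix + struct.pack(">Q", MAX_TS - ts_millis)
```

With keys like this, a scan on the prefix "person|" would cover every
location for that person, and "person|location|invoices|" would cover one
dataset at one location, newest first. Whether that is better than several
families is exactly the part I am unsure about.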

Does anyone have any tips, or know of places that do a similar
thing?

Best, James