Posted to dev@poi.apache.org by Nick Burch <ni...@torchbox.com> on 2005/07/31 17:10:01 UTC

Design question - how to hold metadata on record types

Hi All

I've been having a debate on this with Yegor Kozlov, who's done some
PowerPoint work around adding new sheets that he hopes to be able to contribute
to POI in the near future. We can't agree on it, so I'm hoping that some
of the other developers might be able to weigh in with their ideas.

The debate is about how we store the metadata on the different kinds of
records. Currently, for HSLF, the metadata we have is three things: Record
ID (number), Record Name, HSLF class that handles the record.


My preferred solution (and hence the one currently in the codebase) is to
hold all of this metadata in one single place (hslf.record.RecordType). If
there is a single line holding all the metadata for a kind of record, then
there's no duplication, and a much lower chance of mistakes /
inconsistencies creeping in. If you want to look up an ID->Class or
Name->ID mapping, then there's only one class you need to deal with.

I'd say the plus points are:
* single class to deal with for all metadata
* no duplication
* reduced chances of inconsistencies in metadata
And the minus point is:
* not how other bits of POI do it
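
As a minimal sketch of the single-place idea (this is not the actual
hslf.record.RecordType code; the class names, IDs and handler classes
below are made up for illustration), each kind of record gets one line
that carries all three bits of metadata, and both lookups are built from
that one list:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the HSLF classes that would handle each record.
class DocumentRecord {}
class SlideRecord {}
class NotesRecord {}

// One line per kind of record: ID, name and handler class live together.
// The IDs and names used here are illustrative only.
final class RecordTypeSketch {
    // The lookup maps must be declared before the constants, because the
    // constructor registers each constant as it is created.
    private static final Map<Integer, RecordTypeSketch> BY_ID = new HashMap<>();
    private static final Map<String, RecordTypeSketch> BY_NAME = new HashMap<>();

    static final RecordTypeSketch DOCUMENT = new RecordTypeSketch(1000, "Document", DocumentRecord.class);
    static final RecordTypeSketch SLIDE = new RecordTypeSketch(1006, "Slide", SlideRecord.class);
    static final RecordTypeSketch NOTES = new RecordTypeSketch(1008, "Notes", NotesRecord.class);

    final int id;
    final String name;
    final Class<?> handler;

    private RecordTypeSketch(int id, String name, Class<?> handler) {
        this.id = id;
        this.name = name;
        this.handler = handler;
        // Fail fast if two lines claim the same ID or the same name.
        if (BY_ID.put(id, this) != null) {
            throw new IllegalStateException("Duplicate record ID " + id);
        }
        if (BY_NAME.put(name, this) != null) {
            throw new IllegalStateException("Duplicate record name " + name);
        }
    }

    // ID -> metadata lookup; returns null for an unknown ID.
    static RecordTypeSketch forId(int id) {
        return BY_ID.get(id);
    }

    // Name -> metadata lookup; returns null for an unknown name.
    static RecordTypeSketch forName(String name) {
        return BY_NAME.get(name);
    }
}

class RecordTypeDemo {
    public static void main(String[] args) {
        RecordTypeSketch type = RecordTypeSketch.forId(1006);
        System.out.println(type.name + " -> " + type.handler.getSimpleName());
    }
}
```

Adding a 4th bit of metadata in this layout would mean one more
constructor argument on each line, rather than a new file.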


Yegor's preferred solution is to split the metadata up. He would prefer
to store the ID <--> Name mapping in one file (eg RecordType), and the
ID <--> Class mapping in another (eg RecordFactory), as other bits of POI
seem to. My main concern is that we now have duplication of the metadata
(two files have the ID list in them), and this greatly increases the
chances of them going out of sync, and then containing differing /
contradictory information.

The plus point seems to be:
* What other bits of POI have done
And the minus points are:
* Metadata is split into multiple files
* Bits of the metadata are duplicated
* Risk of two files going out of sync and differing from each other
* If we wanted to store a 4th bit of metadata, we'd probably end up with a
  third file
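
To make the out-of-sync risk concrete, here is a rough sketch of the
split layout with a check that compares the two ID lists. Everything in
it is an assumption for illustration: the class names, the IDs and the
handler classes are invented, and this is not how RecordType or
RecordFactory are actually written. A check like this could run as a
unit test to catch drift between the two files:

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical handler classes.
class SlideHandler {}
class NotesHandler {}
class ExtraHandler {}

// "File 1" (hypothetical): the ID <--> Name mapping.
final class SplitRecordTypes {
    static final Map<Integer, String> NAMES = Map.of(
            1000, "Document",
            1006, "Slide",
            1008, "Notes");
}

// "File 2" (hypothetical): the ID <--> Class mapping.
// Note that its ID list has already drifted from the one above.
final class SplitRecordFactory {
    static final Map<Integer, Class<?>> HANDLERS = Map.of(
            1006, SlideHandler.class,
            1008, NotesHandler.class,
            4008, ExtraHandler.class);
}

final class SyncCheck {
    // Returns the IDs that appear in 'expected' but not in 'actual', sorted.
    static Set<Integer> missingFrom(Set<Integer> expected, Set<Integer> actual) {
        Set<Integer> missing = new TreeSet<>(expected);
        missing.removeAll(actual);
        return missing;
    }

    public static void main(String[] args) {
        Set<Integer> named = SplitRecordTypes.NAMES.keySet();
        Set<Integer> handled = SplitRecordFactory.HANDLERS.keySet();
        // A name without a handler may be intentional; a handler without
        // a name is more likely a mistake.
        System.out.println("Named but no handler: " + missingFrom(named, handled));
        System.out.println("Handler but no name:  " + missingFrom(handled, named));
    }
}
```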


So, can anyone offer any suggestions on which route they think is best?
Obviously, I prefer the one where all the metadata stays in a single
place. However, this hasn't been the POI way, so what caused the
difference in thinking?

Cheers
Nick

---------------------------------------------------------------------
To unsubscribe, e-mail: poi-dev-unsubscribe@jakarta.apache.org
Mailing List:    http://jakarta.apache.org/site/mail2.html#poi
The Apache Jakarta POI Project: http://jakarta.apache.org/poi/


Re: Design question - how to hold metadata on record types

Posted by Nick Burch <ni...@torchbox.com>.
On Sun, 31 Jul 2005 acoliver@apache.org wrote:
> However, on its face, this seems to be a minor bikesheddish argument of 
> "what should the class be named" -- generally such discussions have the 
> same bewildering conclusion as "is Coke better than Pepsi".

Great. I thought it might be, but I was worried that there was something 
important that I'd missed in my scheme!


I'll give Yegor a prod, and see how he's getting on with building some 
more patches to HSLF from his work.

Nick



Re: Design question - how to hold metadata on record types

Posted by ac...@apache.org.
Do both simultaneously and let Darwin decide or let two reasonable 
people come to a conclusion later.  That is the POI way.

However, on its face, this seems to be a minor bikesheddish argument of 
"what should the class be named" -- generally such discussions have the 
same bewildering conclusion as "is Coke better than Pepsi".

As a general principle I favor:

DRY - Don't repeat yourself
Avoid brittle couplings (note that the record factory sticks everything in 
a map -- you do have to add things in two places [record, 
record-to-map], but I never found a nicer solution that didn't have 
horrible drawbacks)
Avoid Mega Do Everything classes - for a good example of this see 
classes like SMTPHandler in JAMES vs CmdHELO, CmdAUTH, CmdData... in 
JBoss Mail Server or the big fat glibole2 (granted it's C but the idea 
holds) vs POIFS.
Unit tests galore
Documentation like Quick Guide in HSSF
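
For the "two places [record, record-to-map]" arrangement mentioned in
the list above, a rough sketch might look like the following. It assumes
each record class declares its own ID in a public static field named
sid, and that the factory turns a list of classes into a map by
reflection. The class names and ID values are invented, and this is not
claimed to be the real record factory code:

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Place 1: each (hypothetical) record class declares its own ID.
class ExampleRecordA {
    public static final short sid = 0x0A;
}

class ExampleRecordB {
    public static final short sid = 0x0B;
}

final class FactorySketch {
    // Place 2: the list of record classes the factory knows about.
    private static final List<Class<?>> RECORDS =
            List.of(ExampleRecordA.class, ExampleRecordB.class);

    private static final Map<Short, Class<?>> BY_SID = recordsToMap(RECORDS);

    // Reads the static 'sid' field of each class and builds the ID -> Class map.
    static Map<Short, Class<?>> recordsToMap(List<Class<?>> records) {
        Map<Short, Class<?>> result = new HashMap<>();
        for (Class<?> recordClass : records) {
            try {
                Field sidField = recordClass.getField("sid");
                short sid = sidField.getShort(null); // null target: static field
                if (result.put(sid, recordClass) != null) {
                    throw new IllegalStateException("Duplicate sid " + sid);
                }
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(
                        recordClass.getName() + " has no usable sid field", e);
            }
        }
        return result;
    }

    // Returns the class registered for the given ID, or null if there is none.
    static Class<?> classFor(short sid) {
        return BY_SID.get(sid);
    }

    public static void main(String[] args) {
        System.out.println(classFor((short) 0x0B).getSimpleName());
    }
}
```

The ID itself is only written once (in the record class), but a new
record still has to be added to the factory's list, which is the second
place.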

However I want to focus you on things that matter substantially and let 
the details fall to Darwin:

To leave scratchpad you'll need "round tripping" (read to write 
support), lots of unit tests, documentation, javadoc, normalization 
(don't have your own org.apache.poi.util duplicates) and "community" or 
completeness.

This last one is a soft designation. Look at HSSF: if someone leaves, 
we stay alive because multiple people have touched the code.  For 
another model look at POIFS -- not a whole lot more work can be done. 
Meaning if it is done you don't need completeness.

1. What needs to be done to achieve this?
2. Does this code bring you closer?
3. Why didn't this discussion take place on the list?  Did I miss 
something?  Are the patches in bugzilla (ref no?)

-Andy



-- 
Andrew C. Oliver
SuperLink Software, Inc.

Java to Excel using POI
http://www.superlinksoftware.com/services/poi
Commercial support including features added/implemented, bugs fixed.
