You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cayenne.apache.org by Kevin Menard <km...@servprise.com> on 2008/01/16 16:43:32 UTC

Enum best practices

Hi all,

I'm looking at a problem that enums look like a good fit for, but I've never
bothered with Cayenne's enum support.  So, I'm looking for some help on best
practices.

What I have is an "orders" table that I would like to store an "eval_status"
value to indicate whether the order is as an evaluation order and at what
stage of the eval it is.

Normally, I decompose everything and would have an "eval_status" table with
various values and set up an FK constraint on an "eval_status_id" column on
the "orders" table.  I don't expect these values to change, but if they do,
I'd like to be able to do a simple UPDATE statement on the one row.

In my code, however, I think it would be really nice to be able to use
expressions such as: if (EvalStatus.CONVERTED == order.getEvalStatus()).  It
seems that this would be much nicer than using a Cayenne DO for the
comparison.

Despite all of this, I don't want to be binding the data model too tightly
to Java.  In the past we've used Python to access the DB, so I'd like to
support that as well as I can.

The best way I've come up with is if the enum maps its values to the PKs of
the entries in "eval_status" table.  I think this would achieve what I'd
like, with the downside that referential integrity may be compromised if I
update one but not the other.

Is this largely what others are doing?  Or, do you just use VARCHAR columns
and not worry about the normalization of the DB?

Thanks,
Kevin


Re: Enum best practices

Posted by Michael Gentry <bl...@gmail.com>.
For type/status fields in the past, I've tended to just store them as
attributes in the table itself instead of have a lookup (with a few
exceptions like US State, etc).  To me, this made it easier to do ad
hoc queries, see the value in the table when using a tool like DBEdit,
and I figured the storage was equal-to or less-than the size of the FK
to access the reference table.  Of course, eliminating a join would
also improve performance.  That's just been my preference -- trying to
find a balance between practical and ideal.

I'm not sure if this example will help you, either.  It worked very
well for me in the past.  I'd like to get it working for Cayenne 3.0,
too (and maybe integrate into core Cayenne?).

http://cwiki.apache.org/CAY/enumerations-example.html

The advantages of the way I did it is it provided strongly typed
setters/getters, you could map to arbitrary values in the DB (strings
and integers -- current Cayenne 3.0 enum support only supports the
position of the enum, starting at 0), you could use the enum constant
in your queries, the enum values would be translated in the SQL logs.

/dev/mrg


On Jan 16, 2008 10:43 AM, Kevin Menard <km...@servprise.com> wrote:
> Hi all,
>
> I'm looking at a problem that enums look like a good fit for, but I've never
> bothered with Cayenne's enum support.  So, I'm looking for some help on best
> practices.
>
> What I have is an "orders" table that I would like to store an "eval_status"
> value to indicate whether the order is as an evaluation order and at what
> stage of the eval it is.
>
> Normally, I decompose everything and would have an "eval_status" table with
> various values and set up an FK constraint on an "eval_status_id" column on
> the "orders" table.  I don't expect these values to change, but if they do,
> I'd like to be able to do a simple UPDATE statement on the one row.
>
> In my code, however, I think it would be really nice to be able to use
> expressions such as: if (EvalStatus.CONVERTED == order.getEvalStatus()).  It
> seems that this would be much nicer than using a Cayenne DO for the
> comparison.
>
> Despite all of this, I don't want to be binding the data model too tightly
> to Java.  In the past we've used Python to access the DB, so I'd like to
> support that as well as I can.
>
> The best way I've come up with is if the enum maps its values to the PKs of
> the entries in "eval_status" table.  I think this would achieve what I'd
> like, with the downside that referential integrity may be compromised if I
> update one but not the other.
>
> Is this largely what others are doing?  Or, do you just use VARCHAR columns
> and not worry about the normalization of the DB?
>
> Thanks,
> Kevin
>
>

Re: Enum best practices

Posted by Michael Gentry <bl...@gmail.com>.
The enumeration example I have on the wiki (which I used with Cayenne
1.2) allowed for human-suitable names that differed from the actual
backend representation and allowed you to use localization for
different presentations (useful in a web application).  I'm not sure
if that would be applicable in your situation, though.  I wasn't even
really promoting that feature for a native Cayenne implementation, but
if there is a need for it ...

/dev/mrg


On Jan 17, 2008 3:20 PM, Kevin Menard <km...@servprise.com> wrote:
> On 1/17/08 9:47 AM, "Michael Gentry" <bl...@gmail.com> wrote:
>
> > Ari, I was going to mention on the other thread (with Kevin) that it
> > would be nice if you could map the enumerated values in Cayenne
> > Modeler.  I believe Andrus even mentioned keeping the mapping in the
> > XML files, too.  The modeler could generate the code for the enums,
> > too, as you said.  This would free the user from having to write the
> > boilerplate code themselves.  I still see needing two types of
> > enumerations -- integer and string based values.  I can't really see
> > the need for floating point values at this time.
>
> Yes, what Ari mentioned sounds nice.
>
> I could see value in "composite" enums as well, though.  I can't think of a
> better term, but they're basically enums with fields that could then be
> mapped to DB columns.
>
> For example, we use FedEx to process our shipments.  The FedEx API has codes
> for each service, but the codes are not really in a suitable form for a
> customer.  So, we store the code and the textual description as two columns
> in the table.  The set of values is largely fixed, but as we offer new
> shipping methods (e.g., we started offering USPS), we added more rows to the
> table.
>
> It'd be nice if there were a way to map the rows in that table as enum
> instances (Java5 or otherwise) automatically.  The code could be
> considerably simplified (I'm a big fan of switching with Java5 enums), but
> maintain the flexibility of a DB backend.
>
> I'm acutely aware that this is likely a pipedream though, given the invasive
> nature of it all.
>
> --
> Kevin
>
>

Re: Enum best practices

Posted by Kevin Menard <km...@servprise.com>.
On 1/17/08 9:47 AM, "Michael Gentry" <bl...@gmail.com> wrote:

> Ari, I was going to mention on the other thread (with Kevin) that it
> would be nice if you could map the enumerated values in Cayenne
> Modeler.  I believe Andrus even mentioned keeping the mapping in the
> XML files, too.  The modeler could generate the code for the enums,
> too, as you said.  This would free the user from having to write the
> boilerplate code themselves.  I still see needing two types of
> enumerations -- integer and string based values.  I can't really see
> the need for floating point values at this time.

Yes, what Ari mentioned sounds nice.

I could see value in "composite" enums as well, though.  I can't think of a
better term, but they're basically enums with fields that could then be
mapped to DB columns.

For example, we use FedEx to process our shipments.  The FedEx API has codes
for each service, but the codes are not really in a suitable form for a
customer.  So, we store the code and the textual description as two columns
in the table.  The set of values is largely fixed, but as we offer new
shipping methods (e.g., we started offering USPS), we added more rows to the
table.

It'd be nice if there were a way to map the rows in that table as enum
instances (Java5 or otherwise) automatically.  The code could be
considerably simplified (I'm a big fan of switching with Java5 enums), but
maintain the flexibility of a DB backend.

I'm acutely aware that this is likely a pipedream though, given the invasive
nature of it all.

-- 
Kevin


Re: Enum best practices

Posted by Michael Gentry <bl...@gmail.com>.
Ari, I was going to mention on the other thread (with Kevin) that it
would be nice if you could map the enumerated values in Cayenne
Modeler.  I believe Andrus even mentioned keeping the mapping in the
XML files, too.  The modeler could generate the code for the enums,
too, as you said.  This would free the user from having to write the
boilerplate code themselves.  I still see needing two types of
enumerations -- integer and string based values.  I can't really see
the need for floating point values at this time.

/dev/mrg

On Jan 16, 2008 6:01 PM, Aristedes Maniatis <ar...@ish.com.au> wrote:
>
> On 17/01/2008, at 9:32 AM, Kevin Menard wrote:
>
> > That's largely what I'm trying to avoid.  I don't want a bunch of
> > integer
> > values that have no meaning without the corresponding Java code.  I
> > suppose
> > SQL comments are one way to go or external documentation, but each
> > has its
> > own set of downsides.
>
> 1. Use VARCHAR instead and use meaningful names: "hold", "cancelled",
> etc. Not an approach I'd take, but an option.
>
> 2. I've been meaning to work on modeler docs for so long now. I've
> even got some code written for this, but not nearly done. [1]
>
> My point is that this enumerated field approach (as opposed to a
> related table) should only be used if the meaning is tied in deeply to
> the code (eg. cancelled invoices are removed from profit and loss
> calculations automatically by the system). So it is inevitable that
> the meaning is embedded within Java. The trick is to make sure it is
> clearly and simply documented. And that you can't assign a value of 10
> to the status field when they only go to 9.
>
> What would be really cool would be the ability within the modeler to
> define an enumeration which is attached to an attribute. You would
> define the values and names. Cayenne would then create the enum class
> with cgen. Then the model describes the complete picture and many more
> people would use enums instead of arbitrary integer values. Right now
> enums are a bit boring to create (I've got one enum attribute in a
> project I work on).
>
>
> [1] https://issues.apache.org/cayenne/browse/CAY-400
>
>
>
> Ari
>
>
> -------------------------->
> ish
> http://www.ish.com.au
> Level 1, 30 Wilson Street Newtown 2042 Australia
> phone +61 2 9550 5001   fax +61 2 9550 4001
> GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A
>
>
>

Re: Enum best practices

Posted by Aristedes Maniatis <ar...@ish.com.au>.
On 17/01/2008, at 9:32 AM, Kevin Menard wrote:

> That's largely what I'm trying to avoid.  I don't want a bunch of  
> integer
> values that have no meaning without the corresponding Java code.  I  
> suppose
> SQL comments are one way to go or external documentation, but each  
> has its
> own set of downsides.

1. Use VARCHAR instead and use meaningful names: "hold", "cancelled",  
etc. Not an approach I'd take, but an option.

2. I've been meaning to work on modeler docs for so long now. I've  
even got some code written for this, but not nearly done. [1]

My point is that this enumerated field approach (as opposed to a  
related table) should only be used if the meaning is tied in deeply to  
the code (eg. cancelled invoices are removed from profit and loss  
calculations automatically by the system). So it is inevitable that  
the meaning is embedded within Java. The trick is to make sure it is  
clearly and simply documented. And that you can't assign a value of 10  
to the status field when they only go to 9.

What would be really cool would be the ability within the modeler to  
define an enumeration which is attached to an attribute. You would  
define the values and names. Cayenne would then create the enum class  
with cgen. Then the model describes the complete picture and many more  
people would use enums instead of arbitrary integer values. Right now  
enums are a bit boring to create (I've got one enum attribute in a  
project I work on).


[1] https://issues.apache.org/cayenne/browse/CAY-400


Ari


-------------------------->
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001   fax +61 2 9550 4001
GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A



Re: Enum best practices

Posted by Kevin Menard <km...@servprise.com>.
On 1/16/08 3:49 PM, "Aristedes Maniatis" <ar...@ish.com.au> wrote:
 
> I don't quite understand what advantage you get from the above
> approach rather than just creating a table called
> OrderEvaluationStatus and using regular relations?
> 
> For my part, I make a separation between attributes which a user might
> (even theoretically) modify and those which are tied into the
> functioning of the code. If a user might change the data (eg. states,
> postcodes, etc) then it belongs in a separate table. If tied
> instrisically to the code (eg. Invoice.status) then I'd create static
> constants in the Invoice table which point to Integer values for
> database storage. A user shouldn't then be able to create a new
> invoice status even if they have direct access to the database since
> their meaning is heavily tied up in how the code works.
> INVOICE_STATUS_CANCELLED, INVOICE_STATUS_PENDING, INVOICE_STATUS_HOLD
> or whatever.

That's largely what I'm trying to avoid.  I don't want a bunch of integer
values that have no meaning without the corresponding Java code.  I suppose
SQL comments are one way to go or external documentation, but each has its
own set of downsides.

> 
> The main trouble with my approach is that there is nothing stopping a
> programming error like
> 
> invoice.setStatus(PURCHASE_STATUS_HOLD)
> 
> other than the naming convention. Enums should certainly help here.

Correct.  I think they enhance clarity in most cases as well.

> 
> Have I missed the point of what you were asking?
> 

I'm not sure.  I may not have been clear enough to start.  Do my above
comments help elucidate any?

-- 
Kevin


Re: Enum best practices

Posted by Aristedes Maniatis <ar...@ish.com.au>.
On 17/01/2008, at 2:43 AM, Kevin Menard wrote:

> The best way I've come up with is if the enum maps its values to the  
> PKs of
> the entries in "eval_status" table.  I think this would achieve what  
> I'd
> like, with the downside that referential integrity may be  
> compromised if I
> update one but not the other.
>
> Is this largely what others are doing?  Or, do you just use VARCHAR  
> columns
> and not worry about the normalization of the DB?

I don't quite understand what advantage you get from the above  
approach rather than just creating a table called  
OrderEvaluationStatus and using regular relations?

For my part, I make a separation between attributes which a user might  
(even theoretically) modify and those which are tied into the  
functioning of the code. If a user might change the data (eg. states,  
postcodes, etc) then it belongs in a separate table. If tied  
instrisically to the code (eg. Invoice.status) then I'd create static  
constants in the Invoice table which point to Integer values for  
database storage. A user shouldn't then be able to create a new  
invoice status even if they have direct access to the database since  
their meaning is heavily tied up in how the code works.  
INVOICE_STATUS_CANCELLED, INVOICE_STATUS_PENDING, INVOICE_STATUS_HOLD  
or whatever.

The main trouble with my approach is that there is nothing stopping a  
programming error like

invoice.setStatus(PURCHASE_STATUS_HOLD)

other than the naming convention. Enums should certainly help here.


Have I missed the point of what you were asking?

Ari


-------------------------->
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001   fax +61 2 9550 4001
GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A