You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cayenne.apache.org by Kevin Menard <km...@servprise.com> on 2008/01/16 16:43:32 UTC
Enum best practices
Hi all,
I'm looking at a problem that enums look like a good fit for, but I've never
bothered with Cayenne's enum support. So, I'm looking for some help on best
practices.
What I have is an "orders" table that I would like to store an "eval_status"
value to indicate whether the order is as an evaluation order and at what
stage of the eval it is.
Normally, I decompose everything and would have an "eval_status" table with
various values and set up an FK constraint on an "eval_status_id" column on
the "orders" table. I don't expect these values to change, but if they do,
I'd like to be able to do a simple UPDATE statement on the one row.
In my code, however, I think it would be really nice to be able to use
expressions such as: if (EvalStatus.CONVERTED == order.getEvalStatus()). It
seems that this would be much nicer than using a Cayenne DO for the
comparison.
Despite all of this, I don't want to be binding the data model too tightly
to Java. In the past we've used Python to access the DB, so I'd like to
support that as well as I can.
The best way I've come up with is if the enum maps its values to the PKs of
the entries in "eval_status" table. I think this would achieve what I'd
like, with the downside that referential integrity may be compromised if I
update one but not the other.
Is this largely what others are doing? Or, do you just use VARCHAR columns
and not worry about the normalization of the DB?
Thanks,
Kevin
Re: Enum best practices
Posted by Michael Gentry <bl...@gmail.com>.
For type/status fields in the past, I've tended to just store them as
attributes in the table itself instead of have a lookup (with a few
exceptions like US State, etc). To me, this made it easier to do ad
hoc queries, see the value in the table when using a tool like DBEdit,
and I figured the storage was equal-to or less-than the size of the FK
to access the reference table. Of course, eliminating a join would
also improve performance. That's just been my preference -- trying to
find a balance between practical and ideal.
I'm not sure if this example will help you, either. It worked very
well for me in the past. I'd like to get it working for Cayenne 3.0,
too (and maybe integrate into core Cayenne?).
http://cwiki.apache.org/CAY/enumerations-example.html
The advantages of the way I did it is it provided strongly typed
setters/getters, you could map to arbitrary values in the DB (strings
and integers -- current Cayenne 3.0 enum support only supports the
position of the enum, starting at 0), you could use the enum constant
in your queries, the enum values would be translated in the SQL logs.
/dev/mrg
On Jan 16, 2008 10:43 AM, Kevin Menard <km...@servprise.com> wrote:
> Hi all,
>
> I'm looking at a problem that enums look like a good fit for, but I've never
> bothered with Cayenne's enum support. So, I'm looking for some help on best
> practices.
>
> What I have is an "orders" table that I would like to store an "eval_status"
> value to indicate whether the order is as an evaluation order and at what
> stage of the eval it is.
>
> Normally, I decompose everything and would have an "eval_status" table with
> various values and set up an FK constraint on an "eval_status_id" column on
> the "orders" table. I don't expect these values to change, but if they do,
> I'd like to be able to do a simple UPDATE statement on the one row.
>
> In my code, however, I think it would be really nice to be able to use
> expressions such as: if (EvalStatus.CONVERTED == order.getEvalStatus()). It
> seems that this would be much nicer than using a Cayenne DO for the
> comparison.
>
> Despite all of this, I don't want to be binding the data model too tightly
> to Java. In the past we've used Python to access the DB, so I'd like to
> support that as well as I can.
>
> The best way I've come up with is if the enum maps its values to the PKs of
> the entries in "eval_status" table. I think this would achieve what I'd
> like, with the downside that referential integrity may be compromised if I
> update one but not the other.
>
> Is this largely what others are doing? Or, do you just use VARCHAR columns
> and not worry about the normalization of the DB?
>
> Thanks,
> Kevin
>
>
Re: Enum best practices
Posted by Michael Gentry <bl...@gmail.com>.
The enumeration example I have on the wiki (which I used with Cayenne
1.2) allowed for human-suitable names that differed from the actual
backend representation and allowed you to use localization for
different presentations (useful in a web application). I'm not sure
if that would be applicable in your situation, though. I wasn't even
really promoting that feature for a native Cayenne implementation, but
if there is a need for it ...
/dev/mrg
On Jan 17, 2008 3:20 PM, Kevin Menard <km...@servprise.com> wrote:
> On 1/17/08 9:47 AM, "Michael Gentry" <bl...@gmail.com> wrote:
>
> > Ari, I was going to mention on the other thread (with Kevin) that it
> > would be nice if you could map the enumerated values in Cayenne
> > Modeler. I believe Andrus even mentioned keeping the mapping in the
> > XML files, too. The modeler could generate the code for the enums,
> > too, as you said. This would free the user from having to write the
> > boilerplate code themselves. I still see needing two types of
> > enumerations -- integer and string based values. I can't really see
> > the need for floating point values at this time.
>
> Yes, what Ari mentioned sounds nice.
>
> I could see value in "composite" enums as well, though. I can't think of a
> better term, but they're basically enums with fields that could then be
> mapped to DB columns.
>
> For example, we use FedEx to process our shipments. The FedEx API has codes
> for each service, but the codes are not really in a suitable form for a
> customer. So, we store the code and the textual description as two columns
> in the table. The set of values is largely fixed, but as we offer new
> shipping methods (e.g., we started offering USPS), we added more rows to the
> table.
>
> It'd be nice if there were a way to map the rows in that table as enum
> instances (Java5 or otherwise) automatically. The code could be
> considerably simplified (I'm a big fan of switching with Java5 enums), but
> maintain the flexibility of a DB backend.
>
> I'm acutely aware that this is likely a pipedream though, given the invasive
> nature of it all.
>
> --
> Kevin
>
>
Re: Enum best practices
Posted by Kevin Menard <km...@servprise.com>.
On 1/17/08 9:47 AM, "Michael Gentry" <bl...@gmail.com> wrote:
> Ari, I was going to mention on the other thread (with Kevin) that it
> would be nice if you could map the enumerated values in Cayenne
> Modeler. I believe Andrus even mentioned keeping the mapping in the
> XML files, too. The modeler could generate the code for the enums,
> too, as you said. This would free the user from having to write the
> boilerplate code themselves. I still see needing two types of
> enumerations -- integer and string based values. I can't really see
> the need for floating point values at this time.
Yes, what Ari mentioned sounds nice.
I could see value in "composite" enums as well, though. I can't think of a
better term, but they're basically enums with fields that could then be
mapped to DB columns.
For example, we use FedEx to process our shipments. The FedEx API has codes
for each service, but the codes are not really in a suitable form for a
customer. So, we store the code and the textual description as two columns
in the table. The set of values is largely fixed, but as we offer new
shipping methods (e.g., we started offering USPS), we added more rows to the
table.
It'd be nice if there were a way to map the rows in that table as enum
instances (Java5 or otherwise) automatically. The code could be
considerably simplified (I'm a big fan of switching with Java5 enums), but
maintain the flexibility of a DB backend.
I'm acutely aware that this is likely a pipedream though, given the invasive
nature of it all.
--
Kevin
Re: Enum best practices
Posted by Michael Gentry <bl...@gmail.com>.
Ari, I was going to mention on the other thread (with Kevin) that it
would be nice if you could map the enumerated values in Cayenne
Modeler. I believe Andrus even mentioned keeping the mapping in the
XML files, too. The modeler could generate the code for the enums,
too, as you said. This would free the user from having to write the
boilerplate code themselves. I still see needing two types of
enumerations -- integer and string based values. I can't really see
the need for floating point values at this time.
/dev/mrg
On Jan 16, 2008 6:01 PM, Aristedes Maniatis <ar...@ish.com.au> wrote:
>
> On 17/01/2008, at 9:32 AM, Kevin Menard wrote:
>
> > That's largely what I'm trying to avoid. I don't want a bunch of
> > integer
> > values that have no meaning without the corresponding Java code. I
> > suppose
> > SQL comments are one way to go or external documentation, but each
> > has its
> > own set of downsides.
>
> 1. Use VARCHAR instead and use meaningful names: "hold", "cancelled",
> etc. Not an approach I'd take, but an option.
>
> 2. I've been meaning to work on modeler docs for so long now. I've
> even got some code written for this, but not nearly done. [1]
>
> My point is that this enumerated field approach (as opposed to a
> related table) should only be used if the meaning is tied in deeply to
> the code (eg. cancelled invoices are removed from profit and loss
> calculations automatically by the system). So it is inevitable that
> the meaning is embedded within Java. The trick is to make sure it is
> clearly and simply documented. And that you can't assign a value of 10
> to the status field when they only go to 9.
>
> What would be really cool would be the ability within the modeler to
> define an enumeration which is attached to an attribute. You would
> define the values and names. Cayenne would then create the enum class
> with cgen. Then the model describes the complete picture and many more
> people would use enums instead of arbitrary integer values. Right now
> enums are a bit boring to create (I've got one enum attribute in a
> project I work on).
>
>
> [1] https://issues.apache.org/cayenne/browse/CAY-400
>
>
>
> Ari
>
>
> -------------------------->
> ish
> http://www.ish.com.au
> Level 1, 30 Wilson Street Newtown 2042 Australia
> phone +61 2 9550 5001 fax +61 2 9550 4001
> GPG fingerprint CBFB 84B4 738D 4E87 5E5C 5EFA EF6A 7D2E 3E49 102A
>
>
>
Re: Enum best practices
Posted by Aristedes Maniatis <ar...@ish.com.au>.
On 17/01/2008, at 9:32 AM, Kevin Menard wrote:
> That's largely what I'm trying to avoid. I don't want a bunch of
> integer
> values that have no meaning without the corresponding Java code. I
> suppose
> SQL comments are one way to go or external documentation, but each
> has its
> own set of downsides.
1. Use VARCHAR instead and use meaningful names: "hold", "cancelled",
etc. Not an approach I'd take, but an option.
2. I've been meaning to work on modeler docs for so long now. I've
even got some code written for this, but not nearly done. [1]
My point is that this enumerated field approach (as opposed to a
related table) should only be used if the meaning is tied in deeply to
the code (eg. cancelled invoices are removed from profit and loss
calculations automatically by the system). So it is inevitable that
the meaning is embedded within Java. The trick is to make sure it is
clearly and simply documented. And that you can't assign a value of 10
to the status field when they only go to 9.
What would be really cool would be the ability within the modeler to
define an enumeration which is attached to an attribute. You would
define the values and names. Cayenne would then create the enum class
with cgen. Then the model describes the complete picture and many more
people would use enums instead of arbitrary integer values. Right now
enums are a bit boring to create (I've got one enum attribute in a
project I work on).
[1] https://issues.apache.org/cayenne/browse/CAY-400
Ari
-------------------------->
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001 fax +61 2 9550 4001
GPG fingerprint CBFB 84B4 738D 4E87 5E5C 5EFA EF6A 7D2E 3E49 102A
Re: Enum best practices
Posted by Kevin Menard <km...@servprise.com>.
On 1/16/08 3:49 PM, "Aristedes Maniatis" <ar...@ish.com.au> wrote:
> I don't quite understand what advantage you get from the above
> approach rather than just creating a table called
> OrderEvaluationStatus and using regular relations?
>
> For my part, I make a separation between attributes which a user might
> (even theoretically) modify and those which are tied into the
> functioning of the code. If a user might change the data (eg. states,
> postcodes, etc) then it belongs in a separate table. If tied
> instrisically to the code (eg. Invoice.status) then I'd create static
> constants in the Invoice table which point to Integer values for
> database storage. A user shouldn't then be able to create a new
> invoice status even if they have direct access to the database since
> their meaning is heavily tied up in how the code works.
> INVOICE_STATUS_CANCELLED, INVOICE_STATUS_PENDING, INVOICE_STATUS_HOLD
> or whatever.
That's largely what I'm trying to avoid. I don't want a bunch of integer
values that have no meaning without the corresponding Java code. I suppose
SQL comments are one way to go or external documentation, but each has its
own set of downsides.
>
> The main trouble with my approach is that there is nothing stopping a
> programming error like
>
> invoice.setStatus(PURCHASE_STATUS_HOLD)
>
> other than the naming convention. Enums should certainly help here.
Correct. I think they enhance clarity in most cases as well.
>
> Have I missed the point of what you were asking?
>
I'm not sure. I may not have been clear enough to start. Do my above
comments help elucidate any?
--
Kevin
Re: Enum best practices
Posted by Aristedes Maniatis <ar...@ish.com.au>.
On 17/01/2008, at 2:43 AM, Kevin Menard wrote:
> The best way I've come up with is if the enum maps its values to the
> PKs of
> the entries in "eval_status" table. I think this would achieve what
> I'd
> like, with the downside that referential integrity may be
> compromised if I
> update one but not the other.
>
> Is this largely what others are doing? Or, do you just use VARCHAR
> columns
> and not worry about the normalization of the DB?
I don't quite understand what advantage you get from the above
approach rather than just creating a table called
OrderEvaluationStatus and using regular relations?
For my part, I make a separation between attributes which a user might
(even theoretically) modify and those which are tied into the
functioning of the code. If a user might change the data (eg. states,
postcodes, etc) then it belongs in a separate table. If tied
instrisically to the code (eg. Invoice.status) then I'd create static
constants in the Invoice table which point to Integer values for
database storage. A user shouldn't then be able to create a new
invoice status even if they have direct access to the database since
their meaning is heavily tied up in how the code works.
INVOICE_STATUS_CANCELLED, INVOICE_STATUS_PENDING, INVOICE_STATUS_HOLD
or whatever.
The main trouble with my approach is that there is nothing stopping a
programming error like
invoice.setStatus(PURCHASE_STATUS_HOLD)
other than the naming convention. Enums should certainly help here.
Have I missed the point of what you were asking?
Ari
-------------------------->
ish
http://www.ish.com.au
Level 1, 30 Wilson Street Newtown 2042 Australia
phone +61 2 9550 5001 fax +61 2 9550 4001
GPG fingerprint CBFB 84B4 738D 4E87 5E5C 5EFA EF6A 7D2E 3E49 102A