You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@commons.apache.org by Dan Fabulich <da...@fabulich.com> on 2009/02/10 08:32:34 UTC

[dbutils] Questions about a few of bugs

I had some questions about a few remaining bugs... maybe somebody here can 
help...

1) OptimalBeanListHandler

http://issues.apache.org/jira/browse/DBUTILS-37
This bug provides a patch that considerably improves the performance of 
BeanListHandler.

However, Julien correctly points out that:
> With the change I propose, the BeanListHandler#handleRow(ResultSet) 
> method is never called anymore. So, any subclass of BeanListHandler 
> which would override this method would not work properly anymore.

Julien's patch creates a separate "OptimalBeanListHandler" ... but that 
really shouldn't be necessary, right?

We should be able to just replace the existing BLH with this fast one 
without worrying about users who may have extended BLH#handleRow, I'd 
think... At the very least I think nobody should use the old slow one by 
accident.

2) Inflate existing object
http://issues.apache.org/jira/browse/DBUTILS-28

I'm a little skeptical of the patch for this bug.  Is this issue 
important?  (It has 0 votes and 0 watchers, so I think not.)

3) CaseInsensitiveMap
http://issues.apache.org/jira/browse/DBUTILS-34

Two distinct users claim to want CIM to return the keys in their ORIGINAL 
case (i.e. not lowercased).  Why would anyone need this?  I don't get it.

-Dan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

Re: [dbutils] Questions about a few of bugs

Posted by Liam Coughlin <ls...@gmail.com>.

snip

>
> Ah, but the whole point of this enhancement is that it's unperformant to
> use RowProcessor#toBean on each row; it's much faster to call
> RowProcessor#toBeanList once.  So there's no such thing as an
> OptimizedBeanRowHandler; the optimization is to not use a RowHandler!
>
/snip

I'd push back in the complete opposite direction.  You should _always_ use a
rowhandler and the extra fuctions on row handler should all be collapsed to
a single handleRow function -- ditch toBean and toList etc. all together.

ListHandler<MyBeanType> = new ArrayListHandler<MyBeanType>( new
BeanRowHandler<MyBeanType>() )

and the new theoretical BeanRowHandler would only create the index/property
map once per handled resultset ( the real source of the performance issue in
the first place ).

it's a little verbose on the generics but the guts of processing stream
would be significantly simpler ( and thus -- faster ).

Addidtionally you could have a dbutils extra package wich could have
something like a CglibBeanRowHandler or what have you that could generate
and cache classes to execute resulset->bean mappings.




>
>  3) CaseInsensitiveMap
>>> http://issues.apache.org/jira/browse/DBUTILS-34
>>>
>>> Two distinct users claim to want CIM to return the keys in their ORIGINAL
>>> case (i.e. not lowercased).  Why would anyone need this?  I don't get it.
>>>
>>>
>>>  Given the amount of reflection and property referencing that occurs in
>> java
>> apps these days, i can think of a million reasons why a CIM should return
>> the keys in original case.  There shouldn't be any particular performance
>> implication here, just a little bit of extra memory and housekeeping.
>>
>
> Well, here's my thinking:
> 1) We don't provide a public CaseInsensitiveMap in our API.  The CIM is a
> purely private object used in the guts of BasicRowProcessor.
>
> This is why I don't understand the point of the bug...  Why would anyone
> care how this internal object is implemented, so long as BasicRowProcessor
> works?
>

Well the problem is that BasicRowProcessor.toMap returns a CIM.  -- so if
you use the ListMapHandler or any of the other map result functions you get
a CIM back.  This is useful because developers can then interact with the
result map without worring about column casing from the database -- except
where they care about column casing with the database.... sense?
.entrySet() and .keySet() should of returned Strings in original case in the
first place -- shrug.


>
> 2) Extra housekeeping = slower performance
>

True, balancing act.


>
> 3) The biggest risk here is accidentally introducing thread-safety
> problems.  The patch applied to DBUTILS-34 is not thread-safe (I think?)
> because it keeps two internal data structures (the "real" map and the
> "lowerCaseMap") that can fall out-of-sync, corrupting the map.
>

I haven't looked at the patch itself, but in the laziest case, the
granularity where the race condition could exist -- the race condition still
exists by itself since CIM extends HashMap rather then hashtable  and so
isn't threadsafe per se, and CIM is never wrapped in a failfast wrapper and
( obviously ) doesn't extent ConcurrentHashMap.

Again -- balancing act -- simple API vs eeking out a couple of small points
of performance.

Re: [dbutils] Questions about a few of bugs

Posted by Dan Fabulich <da...@fabulich.com>.

Liam Coughlin wrote:

> On Tue, Feb 10, 2009 at 2:32 AM, Dan Fabulich <da...@fabulich.com> wrote:
>
>> 1) OptimalBeanListHandler
>
> Ideally, these classes should be split up more.  There shouldn't really be a
> BeanListHandler, just a ListHandler which takes a RowHandler on
> construction, and then you can just use the appropriate BeanRowHandler or
> OptimizedBeanRowHandler or some other custom row handler to map how you like
> which would resolve the whole issue.

Ah, but the whole point of this enhancement is that it's unperformant to 
use RowProcessor#toBean on each row; it's much faster to call 
RowProcessor#toBeanList once.  So there's no such thing as an 
OptimizedBeanRowHandler; the optimization is to not use a RowHandler!

>> 3) CaseInsensitiveMap
>> http://issues.apache.org/jira/browse/DBUTILS-34
>>
>> Two distinct users claim to want CIM to return the keys in their ORIGINAL
>> case (i.e. not lowercased).  Why would anyone need this?  I don't get it.
>>
>>
> Given the amount of reflection and property referencing that occurs in java
> apps these days, i can think of a million reasons why a CIM should return
> the keys in original case.  There shouldn't be any particular performance
> implication here, just a little bit of extra memory and housekeeping.

Well, here's my thinking:
1) We don't provide a public CaseInsensitiveMap in our API.  The CIM is a 
purely private object used in the guts of BasicRowProcessor.

This is why I don't understand the point of the bug...  Why would anyone 
care how this internal object is implemented, so long as BasicRowProcessor 
works?

2) Extra housekeeping = slower performance

3) The biggest risk here is accidentally introducing thread-safety 
problems.  The patch applied to DBUTILS-34 is not thread-safe (I think?) 
because it keeps two internal data structures (the "real" map and the 
"lowerCaseMap") that can fall out-of-sync, corrupting the map.

-Dan

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@commons.apache.org
For additional commands, e-mail: dev-help@commons.apache.org

Re: [dbutils] Questions about a few of bugs

Posted by Liam Coughlin <ls...@gmail.com>.

On Tue, Feb 10, 2009 at 2:32 AM, Dan Fabulich <da...@fabulich.com> wrote:

> 1) OptimalBeanListHandler
>
> http://issues.apache.org/jira/browse/DBUTILS-37
> This bug provides a patch that considerably improves the performance of
> BeanListHandler.
>
> However, Julien correctly points out that:
>
>> With the change I propose, the BeanListHandler#handleRow(ResultSet) method
>> is never called anymore. So, any subclass of BeanListHandler which would
>> override this method would not work properly anymore.
>>
>
> Julien's patch creates a separate "OptimalBeanListHandler" ... but that
> really shouldn't be necessary, right?
>
> We should be able to just replace the existing BLH with this fast one
> without worrying about users who may have extended BLH#handleRow, I'd
> think... At the very least I think nobody should use the old slow one by
> accident.
>

Ideally, these classes should be split up more.  There shouldn't really be a
BeanListHandler, just a ListHandler which takes a RowHandler on
construction, and then you can just use the appropriate BeanRowHandler or
OptimizedBeanRowHandler or some other custom row handler to map how you like
which would resolve the whole issue.

Yes, this would be a fairly drastic API change -- one that shold be looked
at pretty seriously for java 5 ports of dbutils.

>
> 2) Inflate existing object
> http://issues.apache.org/jira/browse/DBUTILS-28
>
> I'm a little skeptical of the patch for this bug.  Is this issue important?
>  (It has 0 votes and 0 watchers, so I think not.)
>

Bean reuse handler is a little odd -- but again it comes back to splitting
out a row handler.  The simplest "real"  use case is in mapping database
generated primary keys back into objects for insert statements.

Stored procedure handling is an entirely different issue, and one of the few
places where database specific handlers will probably be needed which is why
it's been avoided so far i think.

> 3) CaseInsensitiveMap
> http://issues.apache.org/jira/browse/DBUTILS-34
>
> Two distinct users claim to want CIM to return the keys in their ORIGINAL
> case (i.e. not lowercased).  Why would anyone need this?  I don't get it.
>
>
Given the amount of reflection and property referencing that occurs in java
apps these days, i can think of a million reasons why a CIM should return
the keys in original case.  There shouldn't be any particular performance
implication here, just a little bit of extra memory and housekeeping.

Cheers,
-L