You are viewing a plain text version of this content. The canonical link for it is here.
Posted to users@tomcat.apache.org by Christopher Schultz <ch...@christopherschultz.net> on 2007/12/12 00:14:13 UTC

Re: [OT] Help with java Lists

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David,

You need a collection that is:

David kerber wrote:
> I need to have some kind of list or collection that I can search quickly
> for a specific entry

Hashed

> and then start stepping through the list item by
> item from that point.

Linked

The only implementation that meets those needs is a LinkedHashSet.
Unfortunately, there's no way to start iterating from the middle of this
type of data structure (nor any Set for that matter). Sets do not have
an ordering, but a LinkedHashSet provides an order. On the other hand,
it does not provide a way to ask "where is this item in the set"?

For that, you need a List.

I would recommend building your own data type that contains both a List
(for identifying items by index and performing manual iterating rather
than using an Iterator) and a HashMap or something similar for quick
lookups. Store your object -> list index in the map and the objects
themselves in the list. You can even write your own iterator to start in
the middle of a list once you've identified where to start.

> If it seems like I'll never get reasonable speed this way, I could
> switch to calling all the stores' data from the database at once, making
> the lists huge, but only needing to load them once.  However, this makes
> speed in searching the lists much more of an issue, and I don't know
> which way is going to give me the best overall performance for this
> report generation.

Sounds like a lot of premature optimization to me. Why not just read
what you need from the database each time?

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHXxnF9CaO5/Lv0PARAvfdAKCrytFSU9v+2OKGP+0egybQYy+fkwCdGVm3
cGQmgZUDc82xeGg6gsT3ziM=
=wYjJ
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: [OT] Help with java Lists

Posted by David Kerber <dc...@verizon.net>.
Christopher Schultz wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> David,
>
> David kerber wrote:
>   
>> My primary job is database design and management; the java side is where
>> I'm weak, and I've spent a LOT of time on these queries, including
>> pushing through some design changes that have helped a lot, but not enough.
>>     
>
> Okay. Forgive my comments about proper database care and feeding...
> usually this list attracts programmers that don't know much about databases.
>   
No problem; I took it in the spirit it was intended (trying to be helpful).

>   
>>> map.put(myDataObj, myDataObj)?
>>>   
>>>       
>> I thought of that, but there is a lot more data in the object than just
>> the stuff I need to search on.  Once I identify the particular object
>> instance I need (which only needs those 3 fields), there are about a
>> dozen other fields I need to retrieve from it to process each record.
>>     
>
> That's fine. If you only use 3 fields for identification, there's no
> harm in keeping all that data together in a single object. It's not like
> it uses up any more memory. Actually, you might be saving a bit of
> memory since you won't need separate key and value objects.
>
>   
>>> When in doubt, abstract: write yourself a helper class that bundles all
>>> this odd logic together. Then, your business code will be a lot simpler
>>> to read, and you can separately unit test your
>>> SpecializedTreeSetHashMapLinkedList class (or whatever). ;)
>>>   
>>>       
>> Yeah, that's what I'm working on right now.  It's coming together,
>> though somewhat slowly since I'm still getting my head around the
>> different kinds of maps and lists, and what each of them is good for.
>>     
>
> It's always worth it to read though the entire java.util API. There are
> some great classes in there.
>   
Yeah, I'm discovering that.  It seems like every time I ask for 
suggestions about how to do something that I'm struggling with in my 
app, somebody suggests something from java.util, and it ends up working 
great.  This time around, it was TreeMap, which led me to SortedMap, 
which so far (about 80% done) looks like it's going to work great.  
After a couple of false starts, I'm using hash maps with the Key being 
the pKey of my db objects (which I already have available from other 
objects in the app), and the Value being a SortedMap of the data for 
each object.  This gives me instant access to the map I need, and a 
separate HashMap keeps the pointer to the key I need to track my place 
in each SortedMap.  I'll post back when I'm done as to how it goes, but 
so far it's looking good.  If I were using java 6, I would have some 
even more helpful navigation methods, such as higherKey(), but what I've 
got isn't bad.

D



---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: [OT] Help with java Lists

Posted by Christopher Schultz <ch...@christopherschultz.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David,

David kerber wrote:
> My primary job is database design and management; the java side is where
> I'm weak, and I've spent a LOT of time on these queries, including
> pushing through some design changes that have helped a lot, but not enough.

Okay. Forgive my comments about proper database care and feeding...
usually this list attracts programmers that don't know much about databases.

>> map.put(myDataObj, myDataObj)?
>>   
> I thought of that, but there is a lot more data in the object than just
> the stuff I need to search on.  Once I identify the particular object
> instance I need (which only needs those 3 fields), there are about a
> dozen other fields I need to retrieve from it to process each record.

That's fine. If you only use 3 fields for identification, there's no
harm in keeping all that data together in a single object. It's not like
it uses up any more memory. Actually, you might be saving a bit of
memory since you won't need separate key and value objects.

>> When in doubt, abstract: write yourself a helper class that bundles all
>> this odd logic together. Then, your business code will be a lot simpler
>> to read, and you can separately unit test your
>> SpecializedTreeSetHashMapLinkedList class (or whatever). ;)
>>   
> Yeah, that's what I'm working on right now.  It's coming together,
> though somewhat slowly since I'm still getting my head around the
> different kinds of maps and lists, and what each of them is good for.

It's always worth it to read though the entire java.util API. There are
some great classes in there.

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHYFnP9CaO5/Lv0PARAr/dAJ90FhUTXph1u7lOgnnbjXub1xeu4gCgvLRV
cmlvpiSB4wDMR8xOPJve8a0=
=GIZT
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: [OT] Help with java Lists

Posted by David kerber <dc...@verizon.net>.
Christopher Schultz wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> David,
>
>   
>> I'm already on my 3rd attempt at optimization for this section.  The
>>  first round was having the db do _all_ the work, submitting a
>> complex query (a view, actually) and returning a resultset with all
>> the data I need. The query took forever, and the java code to produce
>> the report was really fast.
>>     
>
> I would bet on the speed of the (properly indexed and tuned) database
> against Java any day, honestly. These things were built to perform these
>   
The major slowdown in the db is related to a weak (not horrible, but not 
great either) initial db design, plus some design requirements that make 
writing clean queries nearly impossible.  There are some things that 
java (or most any programming language) is better than databases at, and 
the comparing (not just summing up) data from sequential data records is 
one of them (unless you are using a specialized data warehouse database, 
which we are not).

> types of requests. No offense to your abilities, but have you had a DBA
> check out your tables and indexes to see if they are properly supporting
> the queries you are attempting to execute? It's easy to write a
> long-running query, but just a few tweaks can reduce the query time by
> orders of magnitude.
>   
My primary job is database design and management; the java side is where 
I'm weak, and I've spent a LOT of time on these queries, including 
pushing through some design changes that have helped a lot, but not enough.

>> One issue I ran into earlier today with plain HashMaps was that I 
>> couldn't figure out how to search for a specific piece of data, which
>>  requires matching on a site number, a date and a shift, and for some
>>  data another date.
>>     
>
> Right: you need to hash the 3 data items together. Typically, you would
> write a class that bundles the 3 data together and provides a hashCode
> method that incorporates them and an equals method that checks them.
> Then, use that as the key to your map. Is there any reason why you can't
> use the data objects themselves in this way?
>
> map.put(myDataObj, myDataObj)?
>   
I thought of that, but there is a lot more data in the object than just 
the stuff I need to search on.  Once I identify the particular object 
instance I need (which only needs those 3 fields), there are about a 
dozen other fields I need to retrieve from it to process each record.

>   
>> I have a workable solution right now, though the code is rather more
>> complex than I would like to keep my navigation of all those TreeMaps
>> in sync.
>>     
>
> When in doubt, abstract: write yourself a helper class that bundles all
> this odd logic together. Then, your business code will be a lot simpler
> to read, and you can separately unit test your
> SpecializedTreeSetHashMapLinkedList class (or whatever). ;)
>   
Yeah, that's what I'm working on right now.  It's coming together, 
though somewhat slowly since I'm still getting my head around the 
different kinds of maps and lists, and what each of them is good for.

Thanks!
D



---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: [OT] Help with java Lists

Posted by Christopher Schultz <ch...@christopherschultz.net>.
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

David,

> I'm already on my 3rd attempt at optimization for this section.  The
>  first round was having the db do _all_ the work, submitting a
> complex query (a view, actually) and returning a resultset with all
> the data I need. The query took forever, and the java code to produce
> the report was really fast.

I would bet on the speed of the (properly indexed and tuned) database
against Java any day, honestly. These things were built to perform these
types of requests. No offense to your abilities, but have you had a DBA
check out your tables and indexes to see if they are properly supporting
the queries you are attempting to execute? It's easy to write a
long-running query, but just a few tweaks can reduce the query time by
orders of magnitude.

> One issue I ran into earlier today with plain HashMaps was that I 
> couldn't figure out how to search for a specific piece of data, which
>  requires matching on a site number, a date and a shift, and for some
>  data another date.

Right: you need to hash the 3 data items together. Typically, you would
write a class that bundles the 3 data together and provides a hashCode
method that incorporates them and an equals method that checks them.
Then, use that as the key to your map. Is there any reason why you can't
use the data objects themselves in this way?

map.put(myDataObj, myDataObj)?

> I have a workable solution right now, though the code is rather more
> complex than I would like to keep my navigation of all those TreeMaps
> in sync.

When in doubt, abstract: write yourself a helper class that bundles all
this odd logic together. Then, your business code will be a lot simpler
to read, and you can separately unit test your
SpecializedTreeSetHashMapLinkedList class (or whatever). ;)

- -chris
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHX/sQ9CaO5/Lv0PARAjN0AJ9tgdFORyqM8VdGJjoWj50Ihvw0UgCgg8wN
+JCz1Ve/reJXVRGw8mPvrZY=
=bJvv
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org


Re: [OT] Help with java Lists

Posted by David Kerber <dc...@verizon.net>.
Christopher Schultz wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> David,
>
> You need a collection that is:
>
> David kerber wrote:
>   
>> I need to have some kind of list or collection that I can search quickly
>> for a specific entry
>>     
>
> Hashed
>
>   
>> and then start stepping through the list item by
>> item from that point.
>>     
>
> Linked
>
> The only implementation that meets those needs is a LinkedHashSet.
> Unfortunately, there's no way to start iterating from the middle of this
> type of data structure (nor any Set for that matter). Sets do not have
> an ordering, but a LinkedHashSet provides an order. On the other hand,
> it does not provide a way to ask "where is this item in the set"?
>
> For that, you need a List.
>
> I would recommend building your own data type that contains both a List
> (for identifying items by index and performing manual iterating rather
> than using an Iterator) and a HashMap or something similar for quick
> lookups. Store your object -> list index in the map and the objects
> themselves in the list. You can even write your own iterator to start in
> the middle of a list once you've identified where to start.
>
>   
>> If it seems like I'll never get reasonable speed this way, I could
>> switch to calling all the stores' data from the database at once, making
>> the lists huge, but only needing to load them once.  However, this makes
>> speed in searching the lists much more of an issue, and I don't know
>> which way is going to give me the best overall performance for this
>> report generation.
>>     
>
> Sounds like a lot of premature optimization to me. Why not just read
> what you need from the database each time?
>   
I'm already on my 3rd attempt at optimization for this section.  The 
first round was having the db do _all_ the work, submitting a complex 
query (a view, actually) and returning a resultset with all the data I 
need.  The query took forever, and the java code to produce the report 
was really fast. 

2nd attempt was just the opposite:  the db calls were quick but the java 
code slowed everything down because looped through the list of sites, 
and for each site reads each its data from the db's base tables (5 of 
them), putting each table's data into a resultset and loads the 
resultset data into an iterable collection (ArrayList).  I haven't fully 
isolated the bottleneck of this version, but I think the slowness was 
due to searching the lists for the appropriate data. 

The 3rd one I'm trying now keeps the db queries simple, but I'm trying 
to be more intelligent in my handling of the lists, by using TreeMaps 
and keeping a set of pointers so I don't need to search the lists for 
each piece of data, but can gather all the report data with a single 
scan through each TreeMap.


One issue I ran into earlier today with plain HashMaps was that I 
couldn't figure out how to search for a specific piece of data, which 
requires matching on a site number, a date and a shift, and for some 
data another date.  However, as I was typing this I realized that I just 
need to generate a hash key from those fields when loading up the 
HashMap, and then generate a search key using the same algorithm when I 
need to find a given item.  Tomorrow I'll revisit this to see if it 
helps me out, but I think I have a workable solution right now, though 
the code is rather more complex than I would like to keep my navigation 
of all those TreeMaps in sync.

Thanks for the comments!
D



---------------------------------------------------------------------
To start a new topic, e-mail: users@tomcat.apache.org
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org