
Posted to user@cayenne.apache.org by "Musall, Maik" <ma...@selbstdenker.ag> on 2017/03/06 21:25:15 UTC

Fetching lots of objects

Hi all,

I have a number of statistics functions which need to fetch large numbers of objects. I need the actual DataObjects because that's where the business logic is that I need for the computations.

Let's say I need to fetch 300,000 objects. Let's also assume the database sits on a fast SSD array and can serve multiple connections easily. I'm assuming in this case the CPU time needed for DataObject instantiation is the main performance constraint. Is that correct?

If so, how can I speed this up? Could I partition my fetch, and fetch in several threads in parallel into the same ObjectContext? Or is there an easier way to make use of multiple CPU cores for this?

Thanks
Maik

Re: Fetching lots of objects

Posted by "Musall, Maik" <ma...@selbstdenker.ag>.

Hi Marcel,

I know how to do the actual computation in parallel. My question is how to fetch and instantiate the DataObjects in parallel before I can start the computations. An iterator would only slow down the fetch because of the added roundtrips. Iterators are about reducing memory footprint, while I am not memory-constrained here.

Maik

> On 7 Mar 2017, at 08:30, Markus Reich <ma...@markusreich.at> wrote:
> 
> Hi Maik,
> 
> maybe you can use the new iterator and split the iterator for parallel
> computation?
> 
> public static <T> Stream<T> asStream(Iterator<T> sourceIterator, boolean parallel) {
>   Iterable<T> iterable = () -> sourceIterator;
>   return StreamSupport.stream(iterable.spliterator(), parallel);
> }
> 
> found at
> http://stackoverflow.com/questions/24511052/how-to-convert-an-iterator-to-a-stream
> 
> br
> Meex
> 

Re: Fetching lots of objects

Posted by Markus Reich <ma...@markusreich.at>.

Hi Maik,

maybe you can use the new iterator and split the iterator for parallel
computation?

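import java.util.Iterator;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

// Wraps a one-shot Iterator in a Stream; pass parallel = true for a parallel stream.
public static <T> Stream<T> asStream(Iterator<T> sourceIterator, boolean parallel) {
   Iterable<T> iterable = () -> sourceIterator;
   return StreamSupport.stream(iterable.spliterator(), parallel);
}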

found at
http://stackoverflow.com/questions/24511052/how-to-convert-an-iterator-to-a-stream
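
A minimal usage sketch of the helper above, using only the JDK. An in-memory iterator of integers stands in for a Cayenne result iterator, and expensive() is a hypothetical placeholder for per-object business logic; neither comes from the original message. Note that a stream built from a plain Iterator has no known size, so the default spliterator can only hand out batches as it reads them, and the source itself is still read sequentially.

```java
import java.util.Iterator;
import java.util.stream.IntStream;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

final class IteratorStreamDemo {

    // Same helper as in the message above.
    static <T> Stream<T> asStream(Iterator<T> sourceIterator, boolean parallel) {
        Iterable<T> iterable = () -> sourceIterator;
        return StreamSupport.stream(iterable.spliterator(), parallel);
    }

    // Hypothetical stand-in for per-object business logic.
    static long expensive(int value) {
        return (long) value * value;
    }

    // Sums expensive(i) for i = 1..n, reading the values through an Iterator.
    static long sumOfSquares(int n, boolean parallel) {
        Iterator<Integer> rows = IntStream.rangeClosed(1, n).boxed().iterator();
        return asStream(rows, parallel)
                .mapToLong(IteratorStreamDemo::expensive)
                .sum();
    }

    public static void main(String[] args) {
        // Both calls compute the same sum; only the processing mode differs.
        System.out.println(sumOfSquares(10_000, false));
        System.out.println(sumOfSquares(10_000, true));
    }
}
```

Whether the parallel variant is actually faster depends on how much work is done per element compared with the cost of reading from the iterator.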

br
Meex


Re: Fetching lots of objects

Posted by "Musall, Maik" <ma...@selbstdenker.ag>.

Hi Andrus,

I'll continue this on the dev@ list, if you don't mind.

> On 8 Mar 2017, at 20:13, Andrus Adamchik <an...@objectstyle.org> wrote:
> 
>> It would be nice if Cayenne would internally parallelize things like ObjectResolver.objectsFromDataRows() and use lock-free strategies to deal with the caching.
> 
> This is probably the last (and consequently the worst) place in Cayenne where locking still occurs. After I encountered this problem in a high-concurrency system, I've done some analysis of it (see [1] and also [2]), and this has been my "Cayenne 5.0" plan for a long time. With 4.0 making such progress as it does now, we may actually start contemplating it again.
> 
> Andrus
> 
> 
> [1] https://lists.apache.org/thread.html/b3a990f94a8db3818c7f12eb433a8fef89d5e0afee653def11da1aa9@1382717376@%3Cdev.cayenne.apache.org%3E
> [2] https://lists.apache.org/thread.html/bfcf79ffa521e402d080e3aafc5f0444fa0ab7d09045ec3092aee6c2@1382706785@%3Cdev.cayenne.apache.org%3E

Interesting read!

Regarding the array-based DataObject concept, wouldn't this mean for name-based attribute lookups that you still need a map somewhere that translates names to indexes? That map would only be needed once per entity, however.
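
To make the question concrete, here is a minimal sketch of that idea in plain Java. EntityLayout and ArrayBackedObject are hypothetical names invented for illustration, not Cayenne classes, and nothing here reflects how Cayenne actually stores attribute values. The name-to-index map lives once per entity, while each object only carries an array.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ArrayBackedObjectSketch {

    /** Hypothetical per-entity descriptor: built once, shared by all instances. */
    static final class EntityLayout {
        private final Map<String, Integer> indexByName = new HashMap<>();

        EntityLayout(List<String> attributeNames) {
            for (int i = 0; i < attributeNames.size(); i++) {
                indexByName.put(attributeNames.get(i), i);
            }
        }

        // Translates an attribute name to its slot in the value array.
        int indexOf(String name) {
            Integer index = indexByName.get(name);
            if (index == null) {
                throw new IllegalArgumentException("Unknown attribute: " + name);
            }
            return index;
        }

        int size() {
            return indexByName.size();
        }
    }

    /** Hypothetical object that keeps its values in an array instead of a per-object map. */
    static final class ArrayBackedObject {
        private final EntityLayout layout;
        private final Object[] values;

        ArrayBackedObject(EntityLayout layout) {
            this.layout = layout;
            this.values = new Object[layout.size()];
        }

        Object readProperty(String name) {
            return values[layout.indexOf(name)];
        }

        void writeProperty(String name, Object value) {
            values[layout.indexOf(name)] = value;
        }
    }

    public static void main(String[] args) {
        EntityLayout layout = new EntityLayout(java.util.Arrays.asList("name", "price"));
        ArrayBackedObject object = new ArrayBackedObject(layout);
        object.writeProperty("price", 42);
        System.out.println(object.readProperty("price"));
    }
}
```

Generated accessors could skip the map entirely by using a constant index, so the name lookup would only be paid by generic, name-based access.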

Instead of the array-based approach, did you also consider ConcurrentHashMap and similar classes in java.util.concurrent? It would not have all the other advantages besides concurrency, but could perhaps serve as an easy intermediate step to get rid of the locking, and be implemented even in 4.0 already.
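
As an illustration of what such an intermediate step might look like, the following sketch uses ConcurrentHashMap.computeIfAbsent() for an id-to-object registry. The class and method names are hypothetical, and this is not how Cayenne's caches are implemented. ConcurrentHashMap is not strictly lock-free, but it avoids a single global lock, and computeIfAbsent() guarantees that concurrent callers asking for the same id end up with the same instance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.stream.IntStream;

final class ConcurrentObjectCacheSketch {

    // Hypothetical stand-in for a context's id -> object registry.
    private final Map<Integer, Object> objectsById = new ConcurrentHashMap<>();

    // Returns the registered object for the id, creating it at most once.
    Object getOrCreate(int id, Function<Integer, Object> factory) {
        return objectsById.computeIfAbsent(id, factory);
    }

    int size() {
        return objectsById.size();
    }

    public static void main(String[] args) {
        ConcurrentObjectCacheSketch cache = new ConcurrentObjectCacheSketch();
        // Many threads register overlapping ids; each id is created only once.
        IntStream.range(0, 100_000)
                .parallel()
                .forEach(i -> cache.getOrCreate(i % 1_000, id -> new Object()));
        System.out.println(cache.size());
    }
}
```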

And on the [1] discussion, I'd like to mention my use case again: big queries with lots of prefetches to suck in gigabytes of data for aggregate computations using DataObject business logic. During those fetches, other users expect to be able to continue their regular workload concurrently (which they mostly cannot using EOF: my main reason to switch). So however this [1] concept turns out, I'd like to also be able to parallelize the fetches themselves. A useful first step would be to execute disjoint prefetches in separate threads.

A second step could be to parallelize even a single big table scan query by partitioning. Databases have long been able to organize large tables into partitions that can be scanned independently of each other. Back in the days of Oracle and slower spinning disks you would spread partitions across independent disks; today, with SSDs and zero seek time, partitioning could still help increase throughput when CPU is the limiting factor (databases also tend to generate high CPU loads when doing full table scans, but only on one core per scan). One idea would be to include a partitioning criterion in the model that matches the database's criterion for the table in question.

In the meantime I could try partitioning the queries on the application level, which can also work, but I'm back at the Graph Manager locking problem when merging them into one context for processing.
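
A rough sketch of what application-level partitioning could look like, under the assumption that the table has a numeric key whose values are spread reasonably evenly between a known minimum and maximum. RangePartitioner and Range are hypothetical helpers, not part of Cayenne; each resulting range would then be turned into a qualifier for one of the parallel queries.

```java
import java.util.ArrayList;
import java.util.List;

final class RangePartitioner {

    /** Inclusive key range [from, to]. */
    static final class Range {
        final long from;
        final long to;

        Range(long from, long to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public String toString() {
            return "[" + from + ", " + to + "]";
        }
    }

    // Splits [min, max] into at most `parts` contiguous, disjoint ranges of near-equal size.
    static List<Range> split(long min, long max, int parts) {
        if (parts <= 0 || max < min) {
            throw new IllegalArgumentException("need parts > 0 and max >= min");
        }
        long total = max - min + 1;
        long base = total / parts;
        long remainder = total % parts;
        List<Range> ranges = new ArrayList<>();
        long start = min;
        for (int i = 0; i < parts; i++) {
            // The first `remainder` ranges get one extra key each.
            long size = base + (i < remainder ? 1 : 0);
            if (size == 0) {
                continue; // more parts than keys
            }
            ranges.add(new Range(start, start + size - 1));
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        for (Range range : split(1, 1_291_644, 4)) {
            System.out.println(range);
        }
    }
}
```

If the key values are sparse or skewed, equal-width ranges will produce unequal row counts, so a partitioning column that matches the database's own partitioning would be the better choice where available.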

Today's hardware, with databases on SSDs that can deliver 3 GByte/s or more and 16+ cores for processing, calls for parallelization at every level.

Maik

Re: Fetching lots of objects

Posted by Andrus Adamchik <an...@objectstyle.org>.

Hi Maik,

> On Mar 8, 2017, at 7:47 PM, Musall, Maik <ma...@selbstdenker.ag> wrote:
> 
> Well, if I need them all in the same context to work with after this, I would then need to localObject() them and be back at locking, this time against the graph manager. Dang.

Yes. Unfortunately.

> It would be nice if Cayenne would internally parallelize things like ObjectResolver.objectsFromDataRows() and use lock-free strategies to deal with the caching.

This is probably the last (and consequently the worst) place in Cayenne where locking still occurs. After I encountered this problem in a high-concurrency system, I've done some analysis of it (see [1] and also [2]), and this has been my "Cayenne 5.0" plan for a long time. With 4.0 making such progress as it does now, we may actually start contemplating it again.

Andrus

[1] https://lists.apache.org/thread.html/b3a990f94a8db3818c7f12eb433a8fef89d5e0afee653def11da1aa9@1382717376@%3Cdev.cayenne.apache.org%3E
[2] https://lists.apache.org/thread.html/bfcf79ffa521e402d080e3aafc5f0444fa0ab7d09045ec3092aee6c2@1382706785@%3Cdev.cayenne.apache.org%3E

Re: Fetching lots of objects

Posted by "Musall, Maik" <ma...@selbstdenker.ag>.

Whoa. Parallel instantiation down to <2700 ms using multiple threads with a local ObjectContext each.

Well, if I need them all in the same context to work with after this, I would then need to localObject() them and be back at locking, this time against the graph manager. Dang. It would be nice if Cayenne would internally parallelize things like ObjectResolver.objectsFromDataRows() and use lock-free strategies to deal with the caching.
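
The pattern described here, one private context per worker and a merge step at the end, can be sketched with the JDK alone. LocalContext below is a stand-in invented for this example and is NOT a Cayenne class; in real code each task would create its own ObjectContext and run its own query. The sketch only shows the threading structure, not Cayenne's API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerThreadContextSketch {

    /** Stand-in for an ObjectContext; deliberately not thread-safe. */
    static final class LocalContext {
        private final List<String> registered = new ArrayList<>();

        String materialize(int rowId) {
            String object = "object-" + rowId;
            registered.add(object);
            return object;
        }
    }

    // Each partition is an inclusive {from, to} id pair handled by its own task and context.
    static List<String> fetchAll(List<int[]> partitions, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<List<String>>> futures = new ArrayList<>();
            for (int[] partition : partitions) {
                Callable<List<String>> task = () -> {
                    LocalContext context = new LocalContext(); // never shared between tasks
                    List<String> objects = new ArrayList<>();
                    for (int id = partition[0]; id <= partition[1]; id++) {
                        objects.add(context.materialize(id));
                    }
                    return objects;
                };
                futures.add(pool.submit(task));
            }
            // Merge step: results are collected in partition order on the calling thread.
            List<String> all = new ArrayList<>();
            for (Future<List<String>> future : futures) {
                all.addAll(future.get());
            }
            return all;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<int[]> partitions = new ArrayList<>();
        partitions.add(new int[] {1, 5});
        partitions.add(new int[] {6, 10});
        System.out.println(fetchAll(partitions, 2));
    }
}
```

In this stand-in the merge is a cheap list copy. With real contexts, moving the objects into one shared context is exactly the step that runs into the locking described above.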


> On 8 Mar 2017, at 14:17, John Huss <jo...@gmail.com> wrote:
> 
> If parallel is going to have any benefit you have to be using separate
> object contexts to avoid locking the same DataRow cache.

Re: Fetching lots of objects

Posted by John Huss <jo...@gmail.com>.

If parallel is going to have any benefit you have to be using separate
object contexts to avoid locking the same DataRow cache.
On Wed, Mar 8, 2017 at 5:59 AM Musall, Maik <ma...@selbstdenker.ag> wrote:

>
> > On 8 Mar 2017, at 10:56, Aristedes Maniatis <ar...@maniatis.org> wrote:
> >
> > On 8/3/17 6:54pm, Musall, Maik wrote:
> >
> >> regular SelectQuery: 25888 ms for 1291644 objects
> >> DataRowQuery alone: 14289 ms for 1291644 rows
> >> DataRowQuery sequential instantiation: 6878 ms for 1291644 objects, sum = 21167
> >> DataRowQuery parallel instantiation: 7351 ms for 1291644 objects, sum = 21640
> >> DataRowQuery with iterator: 22484 ms for 1291644 objects
> >> DataRowQuery with batch iterator of 100 each: 21219 ms for 1291644 objects
> >
> > What about trying the new M5 release from yesterday and its ability to select just the columns you need. You'll just get a list of column data instead of a simpler object model, but it might be faster.
> >
>
> This is M5 already (M6-SNAPSHOT really). But I need the full objects
> because I need to do computations on them using the business logic
> implemented in the DataObject class.
>
> Maik
>
>

Re: Fetching lots of objects

Posted by "Musall, Maik" <ma...@selbstdenker.ag>.

> On 8 Mar 2017, at 10:56, Aristedes Maniatis <ar...@maniatis.org> wrote:
> 
> On 8/3/17 6:54pm, Musall, Maik wrote:
> 
>> regular SelectQuery: 25888 ms for 1291644 objects
>> DataRowQuery alone: 14289 ms for 1291644 rows
>> DataRowQuery sequential instantiation: 6878 ms for 1291644 objects, sum = 21167
>> DataRowQuery parallel instantiation: 7351 ms for 1291644 objects, sum = 21640
>> DataRowQuery with iterator: 22484 ms for 1291644 objects
>> DataRowQuery with batch iterator of 100 each: 21219 ms for 1291644 objects
> 
> What about trying the new M5 release from yesterday and its ability to select just the columns you need. You'll just get a list of column data instead of a simpler object model, but it might be faster.
> 

This is M5 already (M6-SNAPSHOT really). But I need the full objects because I need to do computations on them using the business logic implemented in the DataObject class.

Maik

Re: Fetching lots of objects

Posted by Aristedes Maniatis <ar...@maniatis.org>.

On 8/3/17 6:54pm, Musall, Maik wrote:

> regular SelectQuery: 25888 ms for 1291644 objects
> DataRowQuery alone: 14289 ms for 1291644 rows
> DataRowQuery sequential instantiation: 6878 ms for 1291644 objects, sum = 21167
> DataRowQuery parallel instantiation: 7351 ms for 1291644 objects, sum = 21640
> DataRowQuery with iterator: 22484 ms for 1291644 objects
> DataRowQuery with batch iterator of 100 each: 21219 ms for 1291644 objects

What about trying the new M5 release from yesterday and its ability to select just the columns you need? You'll just get a list of column data instead of a simpler object model, but it might be faster.

Ari



-- 
-------------------------->
Aristedes Maniatis
GPG fingerprint CBFB 84B4 738D 4E87 5E5C  5EFA EF6A 7D2E 3E49 102A

Re: Fetching lots of objects

Posted by "Musall, Maik" <ma...@selbstdenker.ag>.

Hi Ari,

> On 7 Mar 2017, at 23:14, Aristedes Maniatis <ar...@maniatis.org> wrote:
> 
> On 7/3/17 8:25am, Musall, Maik wrote:
>> Hi all,
>> 
>> I have a number of statistics functions which need to fetch large amounts of objects. I need the actual DataObjects because that's where the business logic is that I need for the computations.
>> 
>> Let's say I need to fetch 300.000 objects. Let's also assume the database sits on a fast SSD array and can serve multiple connections easily. I'm assuming in this case the CPU time needed for DataObject instantiation is the main performance constraint. Is that correct?
>> 
>> If so, how can I speed this up? Could I partition my fetch, and fetch in several threads in parallel into the same ObjectContext? Or is there an easier way to make use of multiple CPU cores for this?
> 
> 
> I don't think there is anything in Cayenne that will specifically help you here. However if you can partition your search query, then of course you can fetch the data in multiple threads in parallel.
> 
> You might also want to fetch into DataRows rather than creating object entities. I'm not sure if that will make your use case faster, but you could try, especially if you don't need all the columns from the db entity.

I tried that already. Results:

regular SelectQuery: 25888 ms for 1291644 objects
DataRowQuery alone: 14289 ms for 1291644 rows
DataRowQuery sequential instantiation: 6878 ms for 1291644 objects, sum = 21167
DataRowQuery parallel instantiation: 7351 ms for 1291644 objects, sum = 21640
DataRowQuery with iterator: 22484 ms for 1291644 objects
DataRowQuery with batch iterator of 100 each: 21219 ms for 1291644 objects

Sequential vs. parallel here means stream() vs. parallelStream(). The difference between parallel and sequential instantiation varied randomly between runs, so neither was consistently faster.

So, all in all not that much of a difference. The DataRowQuery alone is faster of course, but once you add the instantiation, it ends up in the same ballpark as the regular SelectQuery. A bit faster, but probably not worth the additional coding, or deviating from the regular APIs.

Consistently fastest was doing the parallel fetch: DataRowQuery parallel fetch+instantiation: 19357 ms for 1291644 objects. I partitioned the fetch into 4 pieces (exprs is a list of 4 expressions), and then did:

	// One DataRow query per partition expression; the partitions are fetched in parallel.
	List<PDCMarketingInfo> objects = exprs.parallelStream()
		.flatMap( exp -> {
			SelectQuery<DataRow> dataRowQuery = SelectQuery.dataRowQuery( PDCMarketingInfo.class, exp );
			List<DataRow> dataRows = dataRowQuery.select( oc );
			// Instantiate the objects of this partition in parallel, all in the same context oc.
			return dataRows.parallelStream().map( row -> oc.objectFromDataRow( PDCMarketingInfo.class, row ) );
		} )
		.collect( Collectors.toList() );

I also did this with iterator instead of dataRowQuery.select(), but that was slower.

There may be more benefit from parallelization depending on the hardware used. This was my 2013 MBP with 4 i7 cores.
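
For a quick comparison of the figures above, the following small sketch turns the reported timings into throughput and speedup numbers. The class and method names are made up for this example; the millisecond values and the object count are the ones reported in this message.

```java
final class FetchTimings {

    static final long OBJECTS = 1_291_644L;

    // Objects processed per second for a run that took `millis` milliseconds.
    static double objectsPerSecond(long millis) {
        return OBJECTS * 1000.0 / millis;
    }

    // How many times faster a run is than the baseline run.
    static double speedup(long baselineMillis, long millis) {
        return (double) baselineMillis / millis;
    }

    public static void main(String[] args) {
        long regular = 25_888;        // regular SelectQuery
        long rowsOnly = 14_289;       // DataRowQuery alone
        long sequentialInst = 6_878;  // sequential instantiation
        long parallelFetch = 19_357;  // parallel fetch + instantiation

        System.out.printf("regular SelectQuery: %.0f objects/s%n", objectsPerSecond(regular));
        System.out.printf("DataRows + sequential instantiation: %d ms%n", rowsOnly + sequentialInst);
        System.out.printf("parallel fetch speedup over regular: %.2f%n", speedup(regular, parallelFetch));
    }
}
```

On these numbers the four-way parallel fetch is roughly a third faster than the regular SelectQuery (25888 / 19357 is about 1.34), which matches the impression that the gain on this machine is real but modest.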

Maik

Re: Fetching lots of objects

Posted by Aristedes Maniatis <ar...@maniatis.org>.

On 7/3/17 8:25am, Musall, Maik wrote:
> Hi all,
> 
> I have a number of statistics functions which need to fetch large amounts of objects. I need the actual DataObjects because that's where the business logic is that I need for the computations.
> 
> Let's say I need to fetch 300.000 objects. Let's also assume the database sits on a fast SSD array and can serve multiple connections easily. I'm assuming in this case the CPU time needed for DataObject instantiation is the main performance constraint. Is that correct?
> 
> If so, how can I speed this up? Could I partition my fetch, and fetch in several threads in parallel into the same ObjectContext? Or is there an easier way to make use of multiple CPU cores for this?

I don't think there is anything in Cayenne that will specifically help you here. However if you can partition your search query, then of course you can fetch the data in multiple threads in parallel.

You might also want to fetch into DataRows rather than creating object entities. I'm not sure if that will make your use case faster, but you could try, especially if you don't need all the columns from the db entity.

Ari
