Posted to user@cayenne.apache.org by Lon Varscsak <lo...@gmail.com> on 2016/10/04 15:59:58 UTC

Re: Batch Fetch

So, I finally got back around to this (still no solution).  What you’re
describing isn’t really what I’m after.  What I’d like to do is take an
object (or array of objects) and a list of keys and have Cayenne basically
do the same fetching it does with its pre-fetching code.

So when I use addPrefetch on a query, by default it will execute a fetch
for each of the relationships.  This allows me to decide at the application
level what keys are important to me.  So something like this:

CayenneUtilities.batchFetch(myDataObjects, "toOne", "toMany");

It would execute a query to find all the objects in "toOne" for all the
objects in myDataObjects (and the same for "toMany").  Based on seeing how
Cayenne executes the prefetch queries (at least with UNDEFINED_SEMANTICS), I
bet it would be pretty straightforward.  But I haven’t looked at the code
enough and it might be above my pay grade. ;)
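
To make the idea concrete, here is a minimal, self-contained sketch in plain
Java. It does not use any Cayenne API: BatchFetchSketch, fetchByIds and the
in-memory "table" are all hypothetical stand-ins. It only illustrates the
difference between resolving each to-one relationship with its own query and
collecting the keys first so that a single query resolves all of them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model, not Cayenne code: it only illustrates why one
// batched lookup is cheaper than resolving each relationship separately.
class BatchFetchSketch {

    // Stand-in for a database table: primary key -> row value.
    private final Map<Integer, String> table = new HashMap<>();

    // Counts simulated round trips to the "database".
    int queryCount = 0;

    BatchFetchSketch() {
        table.put(1, "Artist A");
        table.put(2, "Artist B");
        table.put(3, "Artist C");
    }

    // One "query" that resolves many keys at once (think: WHERE id IN (...)).
    Map<Integer, String> fetchByIds(Set<Integer> ids) {
        queryCount++;
        Map<Integer, String> rows = new HashMap<>();
        for (Integer id : ids) {
            if (table.containsKey(id)) {
                rows.put(id, table.get(id));
            }
        }
        return rows;
    }

    // Naive approach: one round trip per source object.
    List<String> resolveOneByOne(List<Integer> foreignKeys) {
        List<String> resolved = new ArrayList<>();
        for (Integer fk : foreignKeys) {
            resolved.add(fetchByIds(Collections.singleton(fk)).get(fk));
        }
        return resolved;
    }

    // Batched approach: collect the distinct keys, fetch once, then map back.
    List<String> resolveInBatch(List<Integer> foreignKeys) {
        Map<Integer, String> rows = fetchByIds(new LinkedHashSet<>(foreignKeys));
        List<String> resolved = new ArrayList<>();
        for (Integer fk : foreignKeys) {
            resolved.add(rows.get(fk));
        }
        return resolved;
    }

    public static void main(String[] args) {
        List<Integer> foreignKeys = Arrays.asList(1, 2, 1, 3);

        BatchFetchSketch naive = new BatchFetchSketch();
        naive.resolveOneByOne(foreignKeys);

        BatchFetchSketch batched = new BatchFetchSketch();
        batched.resolveInBatch(foreignKeys);

        System.out.println("one by one: " + naive.queryCount + " queries");
        System.out.println("batched:    " + batched.queryCount + " queries");
    }
}
```

Both methods return the same values in the same order; only the number of
simulated round trips differs.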

The downside (I think) of the way WOnder implemented it is that it just uses
primary keys to find the objects (lots of ORs), so depending on your
database you also had to specify how many of myDataObjects to process at
once.
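
The "how many to process at once" part can be sketched as a simple
partitioning step. This is plain Java with hypothetical names (KeyChunker,
partition); it is not taken from WOnder or Cayenne, and the right batch size
depends on the database in use.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Cayenne or Wonder.
class KeyChunker {

    // Splits keys into consecutive sublists of at most batchSize elements,
    // so each generated IN (...) or OR list stays below a database limit.
    static <T> List<List<T>> partition(List<T> keys, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < keys.size(); start += batchSize) {
            int end = Math.min(start + batchSize, keys.size());
            // Copy so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(keys.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> keys = Arrays.asList(10, 11, 12, 13, 14);
        for (List<Integer> chunk : partition(keys, 2)) {
            // A real implementation would run one query per chunk here.
            System.out.println(chunk);
        }
    }
}
```

The results of the per-chunk queries would then be concatenated before the
objects are handed back to the caller.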

-Lon

On Tue, Jul 5, 2016 at 11:07 PM, Andrus Adamchik <an...@objectstyle.org>
wrote:

> IIRC in EOF this was "probabilistic", with framework trying to guess which
> other objects' relationships to include in batch fetch. So we'd also need
> to track some kind of "affinity" of root objects between each other.
>
> The first step would be to patch Cayenne to make
> org.apache.cayenne.reflect.FaultFactory injectable (currently it is
> created inside EntityResolver). Then come up with a custom FaultFactory and
> an algorithm for tracking the affinity of faults between each other. And
> make sure it doesn't leak memory :)
>
> Andrus
>
>
> > On Jul 6, 2016, at 12:20 AM, Mike Kienenberger <mk...@gmail.com> wrote:
> >
> > That'd be something I'd get some use out of as well.
> >
> > On Tue, Jul 5, 2016 at 5:19 PM, Lon Varscsak <lo...@gmail.com> wrote:
> >> I know I’ve asked this before, but I need a batch fetch utility class
> >> to trigger batch fetching of relationships.  I know pre-fetching will
> >> do this, but usually when I need it is after the fetch (and I don’t
> >> want to always do it even when it’s not needed).
> >>
> >> Anyone have any pointers on how to go about implementing this?
> >>
> >> -Lon
>
>
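
For reference, the "affinity" tracking Andrus describes in the quoted text
above could look roughly like the sketch below. This is an assumption-heavy
illustration in plain Java: AffinityTracker, registerBatch and siblingsOf are
hypothetical names, nothing here is Cayenne API, and wiring it into a custom
FaultFactory is not shown. The only point is that objects fetched together
are remembered through weak references, so the tracker by itself does not
keep them in memory.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical sketch; Cayenne has no class with this name.
class AffinityTracker {

    // object -> weak references to everything fetched in the same batch.
    // Keys are weak (WeakHashMap) and values only hold WeakReferences, so the
    // tracker never keeps a fetched object alive on its own.
    private final Map<Object, List<WeakReference<Object>>> groups = new WeakHashMap<>();

    // Records that all given objects came from the same fetch.
    synchronized void registerBatch(List<?> fetchedTogether) {
        List<WeakReference<Object>> group = new ArrayList<>();
        for (Object o : fetchedTogether) {
            group.add(new WeakReference<Object>(o));
        }
        for (Object o : fetchedTogether) {
            groups.put(o, group);
        }
    }

    // Returns the still-reachable objects that were fetched with the given
    // one (including the object itself), or an empty list if it is unknown.
    synchronized List<Object> siblingsOf(Object object) {
        List<Object> alive = new ArrayList<>();
        List<WeakReference<Object>> group = groups.get(object);
        if (group != null) {
            for (WeakReference<Object> ref : group) {
                Object o = ref.get();
                if (o != null) {
                    alive.add(o);
                }
            }
        }
        return alive;
    }

    public static void main(String[] args) {
        Object first = new Object();
        Object second = new Object();
        AffinityTracker tracker = new AffinityTracker();
        tracker.registerBatch(Arrays.asList(first, second));
        System.out.println("siblings of first: " + tracker.siblingsOf(first).size());
    }
}
```

Note that WeakHashMap compares keys with equals/hashCode, so this sketch
assumes the tracked objects use identity equality. When a fault fires for one
object, siblingsOf would give the candidates to include in the same batch.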

Re: Batch Fetch

Posted by John Huss <jo...@gmail.com>.
Here is my implementation of this.  A number of restrictions limit its
usefulness, but some could be removed with some rework.  It's not
full-featured enough to add to core as is.  Use at your own risk,
no guarantees, etc.


// Imports needed by the two static methods below (assuming Cayenne 4.0
// package names). Place the methods in a utility class of your choice.
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;

import org.apache.cayenne.Cayenne;
import org.apache.cayenne.DataObject;
import org.apache.cayenne.ObjectContext;
import org.apache.cayenne.PersistenceState;
import org.apache.cayenne.Persistent;
import org.apache.cayenne.exp.ExpressionFactory;
import org.apache.cayenne.exp.Property;
import org.apache.cayenne.map.ObjEntity;
import org.apache.cayenne.map.ObjRelationship;
import org.apache.cayenne.query.SelectQuery;
import org.apache.cayenne.util.ToManyList;

/** Convenience overload for a single path. */
public static List<Persistent> batchFetch(Collection<? extends Persistent> sourceObjects, Property<?> path) {
    List<List<Persistent>> result = batchFetch(sourceObjects, (List) Collections.singletonList(path));
    if (!result.isEmpty()) return result.get(0);
    return Collections.emptyList();
}

/**
 * Like prefetching, resolves related objects in large batches rather than
 * one at a time, but can be run after the initial fetch has already occurred.
 *
 * The related objects must have only a single PK column. This restriction
 * could be removed.
 *
 * This will issue a single query with a single IN expression for each path,
 * so there may be limitations on the number of sourceObjects it supports
 * fetching at once, like 1024 or less. This restriction could be removed.
 *
 * For to-many relationships there are restrictions: it doesn't work for
 * nested paths or relationships without an inverse.
 *
 * @param sourceObjects the already-fetched objects whose relationships should be resolved
 * @param paths the relationships to resolve, one batch query per path
 * @return all the objects that had to be fetched (this excludes objects
 *         that were not hollow to start with)
 */
public static List<List<Persistent>> batchFetch(Collection<? extends Persistent> sourceObjects,
        Collection<? extends Property<? extends Persistent>> paths) {
    if (sourceObjects.isEmpty()) return Collections.emptyList();
    // these have to be unique for the logic to work below
    sourceObjects = new ArrayList<Persistent>(new HashSet<Persistent>(sourceObjects));
    ObjectContext context = sourceObjects.iterator().next().getObjectContext();
    List<List<Persistent>> result = new ArrayList<List<Persistent>>(paths.size());
    ObjEntity objEntity = Cayenne.getObjEntity(sourceObjects.iterator().next());

    for (Property<? extends Persistent> path : paths) {
        ObjRelationship relationship = objEntity.getRelationship(path.getName());
        if (relationship == null) continue;
        if (relationship.isToMany()) {
            // TODO: this doesn't work for paths or relationships without an inverse
            String reverseName = relationship.getReverseRelationshipName();
            if (reverseName == null) continue;

            // One query for all targets whose reverse relationship points at any source.
            SelectQuery<Persistent> query = new SelectQuery<Persistent>(relationship.getTargetEntity(),
                    ExpressionFactory.inExp(reverseName, sourceObjects));

            List<Persistent> matches = context.performQuery(query);
            // Add a copy: 'matches' is drained below as objects are handed to
            // their sources, which would otherwise empty the returned list too.
            result.add(new ArrayList<Persistent>(matches));
            for (Persistent source : sourceObjects) {
                DataObject sourceObject = (DataObject) source;
                ToManyList toManyList = new ToManyList(sourceObject, path.getName());
                // Pick out the fetched objects that belong to this source.
                List<Persistent> relatedObjects = new ArrayList<>(
                        ExpressionFactory.matchExp(reverseName, sourceObject).filterObjects(matches));
                matches.removeAll(relatedObjects);
                toManyList.setValueDirectly(relatedObjects);
                sourceObject.writePropertyDirectly(path.getName(), toManyList);
            }
        } else {
            List<? extends Persistent> related = path.getFromAll(sourceObjects);
            // remove duplicates
            related = new ArrayList<Persistent>(new HashSet<Persistent>(related));
            // remove objects that aren't hollow (probably none)
            Iterator<? extends Persistent> iterator = related.iterator();
            while (iterator.hasNext()) {
                Persistent relatedObject = iterator.next();
                if (relatedObject == null
                        || relatedObject.getPersistenceState() != PersistenceState.HOLLOW) {
                    iterator.remove();
                }
            }
            if (!related.isEmpty()) {
                ObjEntity relatedEntity = Cayenne.getObjEntity(related.get(0));
                List<String> primaryKeyNames = new ArrayList<String>(relatedEntity.getPrimaryKeyNames());
                if (primaryKeyNames.size() != 1) {
                    throw new IllegalArgumentException("Cannot batch fetch. "
                            + relatedEntity.getName() + " has multiple primary key columns.");
                }
                String pk = primaryKeyNames.get(0);
                // Fetching the hollow objects by PK resolves them in the context.
                SelectQuery<Persistent> query = new SelectQuery<Persistent>(relatedEntity,
                        ExpressionFactory.inDbExp(pk, related));
                result.add(context.performQuery(query));
            }
        }
    }
    return result;
}
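
The to-many branch above assigns fetched objects to their sources by
filtering the whole result list once per source object. A possible
alternative, sketched here in plain Java with hypothetical names
(ToManyGrouper, Child, parentId stand in for the fetched objects and their
reverse relationship), is to index the fetched objects by parent key in a
single pass. This is only an illustration of the grouping step, not a
drop-in change to the Cayenne code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration in plain Java; Child and parentId are made-up
// stand-ins for fetched objects and their reverse relationship.
class ToManyGrouper {

    static final class Child {
        final int id;
        final int parentId;

        Child(int id, int parentId) {
            this.id = id;
            this.parentId = parentId;
        }
    }

    // Builds parentId -> children in one pass over the fetched rows. Every
    // requested parent gets an entry, so parents without children resolve to
    // an empty list rather than null.
    static Map<Integer, List<Child>> groupByParent(List<Integer> parentIds, List<Child> fetched) {
        Map<Integer, List<Child>> byParent = new LinkedHashMap<>();
        for (Integer parentId : parentIds) {
            byParent.put(parentId, new ArrayList<Child>());
        }
        for (Child child : fetched) {
            List<Child> bucket = byParent.get(child.parentId);
            if (bucket != null) { // ignore rows for parents we did not ask about
                bucket.add(child);
            }
        }
        return byParent;
    }

    public static void main(String[] args) {
        List<Integer> parents = Arrays.asList(1, 2, 3);
        List<Child> fetched = Arrays.asList(new Child(10, 1), new Child(11, 2), new Child(12, 1));
        Map<Integer, List<Child>> grouped = groupByParent(parents, fetched);
        for (Map.Entry<Integer, List<Child>> entry : grouped.entrySet()) {
            System.out.println("parent " + entry.getKey() + " has " + entry.getValue().size() + " children");
        }
    }
}
```

Each bucket would then be written into the corresponding source object's
to-many list, as the loop in the original code does.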

Re: Batch Fetch

Posted by Andrew Lindesay <ap...@lindesay.co.nz>.
Hello Lon;

Unfortunately my current work does not use Cayenne so I am thinking back
a wee bit, but I think you might be able to do what you want by using
EJBQL.  Look for "Similar Behaviours Using EJBQL" here;

http://cayenne.apache.org/docs/4.0/cayenne-guide/performance-tuning.html#prefetching

Regards;

-- 
Andrew Lindesay
www.silvereye.co.nz

On Wed, 5 Oct 2016, at 05:02, Lon Varscsak wrote:
> Hmmm…but I guess there wouldn’t be another way to do it unless you knew
> the original query.

Re: Batch Fetch

Posted by Lon Varscsak <lo...@gmail.com>.
Hmmm…but I guess there wouldn’t be another way to do it unless you knew the
original query.
