Posted to user-java@ibatis.apache.org by Andrius Juozapaitis <an...@gmail.com> on 2010/03/16 13:35:57 UTC

integrating lucene with ibatis - advice wanted

Hey guys,

I created a framework I plan to use for a few in-house projects, using
ibatis 3. I enjoy the separation of sql, mapping and domain objects, but I
have a pretty complex domain model (first class objects may have any number
of attributes, hierarchical if necessary), so writing sql and taking care
of things like sorting and paging gets really hard, and hurts
performance. I tried using apache lucene to take the actual searches off my
hands, and it works like a charm. I annotate the domain objects with
@Indexable annotations and pass in the name of the mapper (retrieved from
the spring context) that can query the objects from the db; the properties
and attributes of the first class objects are then indexed accordingly *on
application startup*. This builds an index in memory, and the lucene queries
just return the IDs of the matching documents, sorted and filtered, along
with convenient things like totalCount for implementing paging. The actual
data can then be retrieved by simply querying by the primary keys from the
relevant tables.
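
The two-step lookup described above (the search returns only IDs plus a total count, and the rows are then loaded by primary key) could be sketched roughly as follows. This is a dependency-free illustration, not the actual framework code: `TwoStepLookup`, `SearchPage` and the in-memory `rowsById` map are made-up names, the hit list stands in for a Lucene result, and the map stands in for an iBATIS mapper that selects by primary key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TwoStepLookup {
    // Step 1: slice the already sorted and filtered hit list (stand-in for the Lucene result).
    static SearchPage page(List<Long> sortedHitIds, int offset, int limit) {
        int size = sortedHitIds.size();
        int from = Math.min(Math.max(offset, 0), size);
        int to = (int) Math.min((long) from + Math.max(limit, 0), size);
        return new SearchPage(new ArrayList<>(sortedHitIds.subList(from, to)), size);
    }

    // Step 2: load the rows by primary key, keeping the order the search produced.
    static List<String> loadByIds(Map<Long, String> rowsById, List<Long> ids) {
        List<String> rows = new ArrayList<>();
        for (Long id : ids) {
            String row = rowsById.get(id);
            if (row != null) { // the row may have been deleted since it was indexed
                rows.add(row);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        Map<Long, String> rowsById = new HashMap<>(); // hypothetical table contents
        rowsById.put(1L, "alpha");
        rowsById.put(3L, "gamma");
        rowsById.put(5L, "epsilon");
        rowsById.put(7L, "eta");

        List<Long> hits = Arrays.asList(7L, 3L, 1L, 5L); // hypothetical sorted search hits
        SearchPage secondPage = page(hits, 2, 2);
        System.out.println(loadByIds(rowsById, secondPage.ids) + " of " + secondPage.totalCount);
    }
}

// Hypothetical result holder: one page of matching IDs plus the overall hit count.
final class SearchPage {
    final List<Long> ids;
    final int totalCount;

    SearchPage(List<Long> ids, int totalCount) {
        this.ids = ids;
        this.totalCount = totalCount;
    }
}
```

The total count comes from the full hit list, not from the page, which is what makes it usable for a pager.
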
Now, there is the issue of how to synchronize the database state with the
lucene index. I see a few ways to implement this:
1. Mark the mapper methods that are supposed to change the state of the
database with a marker annotation, and create an aspect that intercepts
those calls and updates the lucene index.
This has a nasty side effect: if someone changes the data in the db
directly, the index won't know about it until the application is restarted,
but technically, I can live with that.
2. Create an aspect that intercepts certain calls in the iBATIS layer and,
depending on the operation at hand (insert, update, delete), updates the
lucene index. Beats annotating the methods, and is semi-automatic. Not sure
at which layer I should intercept this, though - hence, if there's a
definite place to hang this aspect on, I'm all ears.
3. Create database triggers that would fire events on
insert/update/delete using JMS or any other communication mechanism. This is
probably the best way to go (my objects have well-defined identities), but
requires a lot of coding, and would kill the performance on update-heavy
operations (imports and such), as a network roundtrip will be required
from app server->db server->[app server->db server]->app server (square
brackets representing the call of the lucene index update by the trigger),
not to mention the hurdles of setting up a JMS producer in a postgres
environment...
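
To make option 1 a bit more concrete, here is a minimal sketch of a marker annotation plus an interceptor. It deliberately uses a plain JDK dynamic proxy instead of a Spring aspect so that it runs without any dependencies; `@ModifiesIndex`, `PersonMapper`, `InMemoryPersonMapper`, `IndexUpdater` and `IndexSyncProxy` are all hypothetical names, and the callback stands in for whatever code would actually update the Lucene index.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

final class IndexSyncDemo {
    public static void main(String[] args) {
        PersonMapper mapper = IndexSyncProxy.wrap(PersonMapper.class, new InMemoryPersonMapper(),
                (name, callArgs) -> System.out.println("re-index after " + name + " of id " + callArgs[0]));
        mapper.insert(42L, "Ada");                 // annotated: triggers the callback
        System.out.println(mapper.findName(42L));  // not annotated: index left alone
    }
}

// Hypothetical marker for mapper methods that change database state.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ModifiesIndex {}

// Hypothetical mapper; in the real project this would be an iBATIS mapper interface.
interface PersonMapper {
    @ModifiesIndex
    void insert(long id, String name);

    String findName(long id);
}

// Stand-in for the database so the sketch runs on its own.
final class InMemoryPersonMapper implements PersonMapper {
    private final Map<Long, String> rows = new HashMap<>();

    @Override public void insert(long id, String name) { rows.put(id, name); }
    @Override public String findName(long id) { return rows.get(id); }
}

// Hypothetical callback standing in for the code that updates the Lucene index.
interface IndexUpdater {
    void onChange(String methodName, Object[] args);
}

final class IndexSyncProxy implements InvocationHandler {
    private final Object target;
    private final IndexUpdater updater;

    private IndexSyncProxy(Object target, IndexUpdater updater) {
        this.target = target;
        this.updater = updater;
    }

    @SuppressWarnings("unchecked")
    static <T> T wrap(Class<T> mapperType, T target, IndexUpdater updater) {
        return (T) Proxy.newProxyInstance(mapperType.getClassLoader(),
                new Class<?>[] {mapperType}, new IndexSyncProxy(target, updater));
    }

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        Object result;
        try {
            result = method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // the mapper call failed, so the index is not touched
        }
        // Only update the index after the database call has succeeded.
        if (method.isAnnotationPresent(ModifiesIndex.class)) {
            updater.onChange(method.getName(), args);
        }
        return result;
    }
}
```

Calling the updater only after the mapper call returns keeps a failed insert from leaving a stray document in the index. Like option 1 itself, this sees only changes made through the wrapped mapper.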

There are probably other ways to implement this - I would be very grateful
if you guys could share your insights on this.

Thanks in advance,
Andrius

Re: integrating lucene with ibatis - advice wanted

Posted by Andrius Juozapaitis <an...@gmail.com>.
I actually did look at it - but I didn't like the API, and it seemed a bit
heavyweight for me. Simple lucene is enough for my needs, and it didn't take
that much effort to get it running. Their forums aren't that active on
this topic either. Might take a second look though - thanks!

regards,
Andrius

On Tue, Mar 16, 2010 at 7:33 PM, Ashok Madhavan <as...@gmail.com> wrote:

> did you look at compass (http://www.compass-project.org/)?
>
> as per their documentation they already have ibatis + lucene covered
> (maybe it is ibatis 2.x).
>
> regards
> ashok

Re: integrating lucene with ibatis - advice wanted

Posted by Ashok Madhavan <as...@gmail.com>.
did you look at compass (http://www.compass-project.org/)?

as per their documentation they already have ibatis + lucene covered
(maybe it is ibatis 2.x).

regards
ashok

---------------------------------------------------------------------
To unsubscribe, e-mail: user-java-unsubscribe@ibatis.apache.org
For additional commands, e-mail: user-java-help@ibatis.apache.org


Re: integrating lucene with ibatis - advice wanted

Posted by Andrius Juozapaitis <an...@gmail.com>.
Simone,

Thanks for the compliments, this functionality is by no means complete, and
it's currently a part of a reasonably proof-of-concept
(smartgwt-gwtrpcdatasource-gwtdispatcher (command pattern for gwt)-business
logic in spring/jms/aspects/springmvc/spring
security/etc-lucene-ibatis-postgres), but I already refactored it into a
number of smaller maven modules in a multi-module maven project. I could
probably get rid of most of the persistence altogether and use hibernate,
but I still want to use the same domain objects in gwt and spring layers,
and hibernate kinda ruins the idea by enhancing the domain objects, so you
have to fall back to maintaining parallel hierarchies of DTOs and domain
objects - you get my drift :)
I could definitely release at least this part of the project as an
open source extension, no question about that, if only there's enough
interest in it, but I would still like to hear more opinions about the
matter before diving into implementation.

regards,
Andrius


Re: integrating lucene with ibatis - advice wanted

Posted by Simone Tripodi <si...@gmail.com>.
Hi Andrius,
very nice to meet you and congratulations on the idea, I found it
brilliant. These are just my 2 cents.
Generally speaking, I'd suggest you implement solution number
1: 1 interceptor and 1 annotation, nothing more, a clear and reusable
solution;
2 is also great and easy to implement, but IMHO 1 is easier to
understand for the other people involved in your project;
I don't agree that 3 would be the best way: 1 and 2 are more reusable,
and moreover 3 involves more dependencies!
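
For comparison, option 2 boils down to a single interception point that looks at the kind of statement being executed rather than at per-method annotations. The sketch below is dependency-free and everything in it is hypothetical (`IndexSync`, `CommandType`, and the `Map` standing in for the Lucene index). iBATIS 3 does document a plugin mechanism for intercepting calls on its internal components, which may be a natural place to hang this, but that wiring is not shown here and should be checked against the iBATIS documentation.

```java
import java.util.HashMap;
import java.util.Map;

final class IndexSync {
    // Stand-in for the Lucene index: document ID -> indexed text.
    private final Map<Long, String> index = new HashMap<>();

    // Meant to be called once per intercepted statement, after it succeeded against the database.
    void afterStatement(CommandType type, long id, String indexedText) {
        switch (type) {
            case INSERT:
            case UPDATE:
                index.put(id, indexedText); // add or replace the document
                break;
            case DELETE:
                index.remove(id);
                break;
            default:
                break; // reads never touch the index
        }
    }

    String indexed(long id) {
        return index.get(id);
    }

    int size() {
        return index.size();
    }

    public static void main(String[] args) {
        IndexSync sync = new IndexSync();
        sync.afterStatement(CommandType.INSERT, 1L, "first version");
        sync.afterStatement(CommandType.UPDATE, 1L, "second version");
        System.out.println(sync.indexed(1L));
        sync.afterStatement(CommandType.DELETE, 1L, null);
        System.out.println(sync.size());
    }
}

// Hypothetical operation kinds; a real interceptor would derive this from the statement being run.
enum CommandType { INSERT, UPDATE, DELETE, SELECT }
```

Treating insert and update the same way (replace the document for that ID) keeps the dispatch small, at the cost of re-indexing the whole object on every update.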

A small question: do you intend to Open Source it? I'd be very very
interested in having a look at it and it would be an excellent iBatis
3rd party extension.

Thanks a lot for sharing your thoughts.
Simo

http://people.apache.org/~simonetripodi/




Re: integrating lucene with ibatis - advice wanted

Posted by François Schiettecatte <fs...@gmail.com>.
Andrius

Like Simo, I like option 1; it is probably the simplest option. 3 could get really ugly and pushes functionality down into the RDBMS, which is not great for scaling, and it also adds overhead to any insert/update/delete operations.

One thing you might consider (which I do for a project I am working on) is to set up a very simple queuing system between ibatis and lucene. Doing this will allow you to control the lucene update cycle more closely, so you can batch updates, which is going to be more efficient, and your application is less likely to stall if lucene goes down for some reason. I like to think of it as a buffer between the two which allows me to manage the flow of data between ibatis and lucene more closely.
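
A very small version of such a buffer might look like the sketch below. It is only an illustration under assumptions: `IndexQueue` and `IndexEvent` are made-up names, the queue is an in-process `LinkedBlockingQueue` rather than any particular messaging product, and the code that would apply a batch to Lucene is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

final class IndexQueue {
    private final BlockingQueue<IndexEvent> queue = new LinkedBlockingQueue<>();

    // Called from the persistence side; returns immediately, so writes never wait on the indexer.
    void publish(IndexEvent event) {
        queue.add(event);
    }

    // Called from the indexer thread: wait up to timeoutMillis for one event, then take
    // whatever else is already queued, up to maxBatch events in total.
    List<IndexEvent> nextBatch(int maxBatch, long timeoutMillis) {
        List<IndexEvent> batch = new ArrayList<>();
        if (maxBatch <= 0) {
            return batch;
        }
        try {
            IndexEvent first = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
            if (first != null) {
                batch.add(first);
                queue.drainTo(batch, maxBatch - 1);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt and return what we have
        }
        return batch;
    }

    public static void main(String[] args) {
        IndexQueue queue = new IndexQueue();
        for (long id = 1; id <= 5; id++) {
            queue.publish(new IndexEvent(id, false));
        }
        List<IndexEvent> batch;
        while (!(batch = queue.nextBatch(3, 50)).isEmpty()) {
            System.out.println("applying " + batch.size() + " index updates in one go");
        }
    }
}

// Hypothetical change event: which document changed and whether it was removed.
final class IndexEvent {
    final long id;
    final boolean deleted;

    IndexEvent(long id, boolean deleted) {
        this.id = id;
        this.deleted = deleted;
    }
}
```

Because the queue here is unbounded and lives in memory, events pile up if the indexer stalls and are lost on restart; a startup re-index like the one described in the original post would cover the restart case.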

You might also want to look at Sphinx Search:

	http://www.sphinxsearch.com/

It uses a different model from yours: yours is a push model (which I prefer), theirs is a pull model.


Cheers

François

