Posted to java-user@lucene.apache.org by markharw00d <ma...@yahoo.co.uk> on 2005/09/17 02:27:29 UTC
Lucene database bindings
I know there have been some posts discussing how to integrate Lucene
with Derby recently.
I've added an example project that works with both HSQLDB and Derby
here: http://issues.apache.org/jira/browse/LUCENE-434
The bindings allow you to use SQL that mixes database and Lucene
functionality in ways like this:
select top 10 lucene_score(id) as SCORE,
lucene_highlight(adText) from ads
where pricePounds <200 and pricePounds >1
and lucene_query('"drum kit"',id)>0
order by SCORE DESC, pricePounds ASC
See the readme.txt in the zip file for details.
Cheers,
Mark
___________________________________________________________
To help you stay safe and secure online, we've developed the all new Yahoo! Security Centre. http://uk.security.yahoo.com
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Lucene database bindings
Posted by Mag Gam <ma...@gmail.com>.
Mark:
Thanks for looking at this. I will try it out!
Re: Lucene database bindings
Posted by mark harwood <ma...@yahoo.co.uk>.
>>does it deal w/ aggregate functions and group by
>> clauses?
Yes, it is basically *all* the normal SQL
functionality but with the added option to mix in
scores from lucene queries to the criteria.
From the example code:
select top 10 count(*) as numAds,pricePounds from ads
where pricePounds <500 and lucene_query('table',id)>0
group by pricePounds order by numAds desc
This returns the top 10 most common prices for a table
(as in kitchen table, not SQL table). The database has
classified ad descriptions and prices so there's not
much meaningful to group on. A "category" column
would be a better example for grouping but there isn't
one in the example data.
Cheers,
Mark
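As a rough illustration of what the grouping query above computes, here is a minimal plain-Java sketch. It does not use the LUCENE-434 code or a database: the PriceGrouping and Ad names are made up for this example, and the score field stands in for whatever lucene_query would return for a row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from LUCENE-434.
class PriceGrouping {

    // One row of an assumed "ads" table plus the score a Lucene query gave it.
    static class Ad {
        final int pricePounds;
        final float score;

        Ad(int pricePounds, float score) {
            this.pricePounds = pricePounds;
            this.score = score;
        }
    }

    // Mimics: select top <limit> count(*) as numAds, pricePounds from ads
    //         where pricePounds < <maxPrice> and lucene_query(...) > 0
    //         group by pricePounds order by numAds desc
    // Each entry is (pricePounds -> numAds). The order of equal counts is unspecified.
    static List<Map.Entry<Integer, Integer>> topPrices(List<Ad> ads, int maxPrice, int limit) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (Ad ad : ads) {
            if (ad.pricePounds < maxPrice && ad.score > 0f) {
                counts.merge(ad.pricePounds, 1, Integer::sum);
            }
        }
        List<Map.Entry<Integer, Integer>> rows = new ArrayList<>(counts.entrySet());
        rows.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));
        if (rows.size() > limit) {
            return new ArrayList<>(rows.subList(0, limit));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Ad> ads = Arrays.asList(
                new Ad(50, 1.0f), new Ad(50, 0.5f), new Ad(80, 0.3f),
                new Ad(50, 0.0f),   // no Lucene match, so it is filtered out
                new Ad(600, 0.9f)); // fails the price condition
        System.out.println(topPrices(ads, 500, 10));
    }
}
```

In the real bindings the database does the filtering, grouping and ordering; the sketch only shows how a score-based condition combines with ordinary SQL-style predicates.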
Re: Lucene database bindings
Posted by Ray Tsang <sa...@gmail.com>.
I must admit that I have not downloaded the source yet. But a quick
question: does it deal w/ aggregate functions and group by clauses?
Thanks!
Ray
Re: Lucene database bindings
Posted by markharw00d <ma...@yahoo.co.uk>.
>>Basically your lucene_query function will return a true/false in one
>>of the query predicates for each record.
Almost: it returns a score, which is much more useful than just a boolean
and is the key difference between a search engine and a database (partial
matching with relevance-ranked scores). The scores can be used to sort
results by relevance.
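To make the score-versus-boolean point concrete, here is a minimal plain-Java sketch. It is not taken from the LUCENE-434 code: the ScoreRanking class and the map of row id to score are assumptions standing in for the values lucene_query would return.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from LUCENE-434.
class ScoreRanking {

    // Keeps the ids whose score is greater than zero and returns them in
    // descending score order, mimicking
    // "where lucene_query(...) > 0 order by SCORE desc".
    static List<Integer> rankByScore(Map<Integer, Float> scoresById) {
        List<Integer> ids = new ArrayList<>();
        for (Map.Entry<Integer, Float> e : scoresById.entrySet()) {
            if (e.getValue() > 0f) {
                ids.add(e.getKey());
            }
        }
        ids.sort((a, b) -> Float.compare(scoresById.get(b), scoresById.get(a)));
        return ids;
    }

    public static void main(String[] args) {
        Map<Integer, Float> scores = new HashMap<>();
        scores.put(1, 0.2f); // weak match
        scores.put(2, 0.0f); // no match
        scores.put(3, 0.9f); // strong match
        System.out.println(rankByScore(scores)); // prints [3, 1]
    }
}
```

A boolean result could only do the filtering step; the ordering step needs the score.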
Re: Lucene database bindings
Posted by Mag Gam <ma...@gmail.com>.
Mark:
VERY VERY good post! Please publish this doc and example.
Re: Lucene database bindings
Posted by Chris Lu <ch...@gmail.com>.
Mark,
This is really good stuff!
I have been thinking about it for a long while.
Thank you for showing us the door!
Basically your lucene_query function will return a true/false in one
of the query predicates for each record.
This will be very useful when other query predicates can filter out a
lot of records.
Is there any hint that tells the DB to evaluate the lucene_query function last?
Chris Lu
------------------------
Lucene RAD on Any Databases
http://www.dbsight.net
Re: Lucene database bindings
Posted by markharw00d <ma...@yahoo.co.uk>.
Mag Gam wrote:
>Does your example store the index in the derby db or somewhere else? I was
>thinking of indexing a table in a separate column.
The software is not an org.apache.lucene.store.Directory implementation,
i.e. it is not an FSDirectory alternative for persisting Lucene data in a
relational table.
Instead, the software demonstrates a way to extend SQL syntax to allow
Lucene queries to run as in-line functions during the database's
execution of queries. These hybrid SQL statements can take advantage of
the usual database functions for sorting, grouping, joins, conditions,
indexes etc. but also use Lucene queries and highlighting functions, all
in the one SQL statement.
The Lucene indexes used as part of this can be any standard Directory
implementation (e.g. RAM, FS).
The motivation for creating a Lucene/RDBMS hybrid query tool was to
address issues commonly associated with using just Lucene:
1) Sorting on float/date fields and associated memory consumption
2) Representing numbers/dates in Lucene (e.g. having to pad with sufficient
leading zeros and add to the index's list of terms)
3) Retrieving only certain stored fields from a document (all storage
can be done in the db)
4) Issues to do with updating *volatile* data, e.g. price data used in sorts
5) Manually coding joins with RDBMS content as custom filters
6) Too-many-terms exceptions produced by range queries
7) Grouping results, e.g. by website
8) Boosting docs based on stored content, e.g. date
These are the sorts of things an RDBMS can help with.
Cheers
Mark
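Item 2 in the list above (padding numbers with leading zeros) can be sketched in plain Java. This is only an illustration of why the padding is needed when numbers are indexed as text terms and compared as strings; the ZeroPad class and the fixed width are assumptions, not part of Lucene or of the LUCENE-434 code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not a Lucene API.
class ZeroPad {

    // Left-pads a non-negative number with zeros to a fixed width so that
    // lexicographic (string) order matches numeric order.
    static String pad(long value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        String digits = Long.toString(value);
        if (digits.length() > width) {
            throw new IllegalArgumentException("value is wider than " + width + " digits");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = digits.length(); i < width; i++) {
            sb.append('0');
        }
        return sb.append(digits).toString();
    }

    public static void main(String[] args) {
        List<String> raw = new ArrayList<>(Arrays.asList("9", "10", "200"));
        Collections.sort(raw);
        System.out.println(raw); // prints [10, 200, 9]: string order, not numeric order

        List<String> padded = new ArrayList<>();
        for (long v : new long[] {9, 10, 200}) {
            padded.add(pad(v, 5));
        }
        Collections.sort(padded);
        System.out.println(padded); // prints [00009, 00010, 00200]
    }
}
```

Keeping the numeric column in the database avoids this encoding step, since the RDBMS compares and sorts the values as numbers.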
Re: Lucene database bindings
Posted by Mag Gam <ma...@gmail.com>.
Does your example store the index in the derby db or somewhere else? I was
thinking of indexing a table in a separate column.