Posted to java-user@lucene.apache.org by Chris Lu <ch...@gmail.com> on 2005/06/11 09:25:25 UTC

DBSight, search on database by Lucene

Hello Lucene developers,

I would like to introduce myself and thank the Lucene contributors
and this mailing list.
We have just released DBSight 1.0, which is a J2EE application that can
create a search engine on any relational database.

You can build a vertical search website within hours if your data is in
a JDBC-enabled database.

To demonstrate the search capability, we built a demo search over
1.7 million CD album records, using data provided by freedb.org:
http://search.dbsight.com/

DBSight is a highly configurable platform for building search.
It can crawl your database, create indexes, and display search results.
You can customize most of the components and manage the indexes, all
through a web interface.

DBSight is built with Lucene for searching, JDBC for crawling, and 
Velocity for rendering.

Does this qualify DBSight to be listed on the "Powered By Lucene" wiki page?

The step-by-step tutorial linked below shows how the demo search was created.

Resources:
    Step-by-step tutorial: http://www.dbsight.net/mediawiki/index.php?title=Step_by_step
    Demo Search on freedb.org's data: http://search.dbsight.com/
    Feature List: http://www.dbsight.net/?q=node/34

Chris Lu


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: DBSight, search on database by Lucene

Posted by Chris Lu <ch...@gmail.com>.
Thanks, guys!

I have made the changes to the wiki, following Joshua's advice.
It was the cookie/refresh problem.

Chris Lu

Joshua Slive wrote:

>
>
>
> On Sat, 11 Jun 2005, Erik Hatcher wrote:
>
>>
>> On Jun 11, 2005, at 1:08 PM, Chris Lu wrote:
>>
>>> Thanks.
>>>
>>> Somehow I found the "Powered By" Lucene page is "Immutable Page", 
>>> even if I logged in.
>>> http://wiki.apache.org/jakarta-lucene/PoweredBy
>>
>>
>> Wow, it sure is.  I'm CC'ing infrastructure to find out why this page 
>> is immutable.
>
>
>
> Ughhh... Looks like another caching problem.
>
> A shift- or ctrl-refresh should get you the right thing.  You know if 
> you have the right page if your userid appears in the upper-right.
>
> It seems like technically moin needs to send Vary: cookie, but this would
> completely destroy our ability to cache.
>
> What we want is for anything with a Cookie: header to totally bypass 
> the cache.  I don't know of any way to configure that.
>
> Joshua.




wiki now sends Vary: Cookie (was Re: DBSight, search on database by Lucene)

Posted by Joshua Slive <jo...@slive.ca>.

Paul Querna wrote:
> Joshua Slive wrote:
>> What we want is for anything with a Cookie: header to totally bypass 
>> the cache.  I don't know of any way to configure that.
> 
> 
> Moin should be sending Cache-Control: Private in these cases, in 
> addition to the Vary: Cookie header.  If they don't they will break with 
> other upstream proxies that we have no control over.  Fixing it so httpd 
> can cache fixes upstream proxies too, so it is the right thing to do.

I've added the Vary: Cookie header.  I believe that even with the 
current naive Vary handling, this should work ok in mod_cache, since it 
won't store any of the logged-in pages due to the Cache-Control headers.
So the non-cookie version should hang around in the cache.
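
The storing rule described above can be sketched in a few lines. This is only an illustration of the general shared-cache behaviour, not mod_cache's actual code; the class and method names are hypothetical, and treating a response without any Cache-Control header as storable is a simplifying assumption.

```java
import java.util.Locale;

// Hypothetical sketch: decides whether a shared cache may keep a response.
final class SharedCachePolicy {

    /** Returns false when the Cache-Control value contains "private" or "no-store". */
    static boolean mayStore(String cacheControl) {
        if (cacheControl == null) {
            return true; // assumption: no directive means storable in this sketch
        }
        for (String directive : cacheControl.toLowerCase(Locale.ROOT).split(",")) {
            String name = directive.trim();
            int eq = name.indexOf('=');
            if (eq >= 0) {
                // Drop any argument, e.g. max-age=600 becomes max-age.
                name = name.substring(0, eq).trim();
            }
            if (name.equals("private") || name.equals("no-store")) {
                return false; // logged-in pages marked this way never enter the cache
            }
        }
        return true;
    }
}
```

With a rule like this, a logged-in page sent with Cache-Control: private is never stored, so only the anonymous copy of a page stays in the cache.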

Anyway, I hope this makes things much less confusing for people trying 
to edit the pages.

Joshua.



Re: DBSight, search on database by Lucene

Posted by Paul Querna <pq...@apache.org>.
Joshua Slive wrote:
> 
> 
> 
> On Sat, 11 Jun 2005, Erik Hatcher wrote:
> 
>>
>> On Jun 11, 2005, at 1:08 PM, Chris Lu wrote:
>>
>>> Thanks.
>>>
>>> Somehow I found the "Powered By" Lucene page is "Immutable Page", 
>>> even if I logged in.
>>> http://wiki.apache.org/jakarta-lucene/PoweredBy
>>
>>
>> Wow, it sure is.  I'm CC'ing infrastructure to find out why this page 
>> is immutable.
> 
> 
> 
> Ughhh... Looks like another caching problem.
> 
> A shift- or ctrl-refresh should get you the right thing.  You know if 
> you have the right page if your userid appears in the upper-right.
> 
> It seems like technically moin needs to send Vary: cookie, but this would
> completely destroy our ability to cache.

Not if we applied the patch I sent to dev@httpd on Friday.  It fixes 
mod_disk_cache's handling of Vary: to keep separate copies for each 
combo, instead of only a single copy.
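
As a rough illustration of "separate copies for each combo" (not the patch itself, and not mod_disk_cache's real data structures), a cache that honours Vary: Cookie can fold the request's Cookie value into its lookup key. The class below is a hypothetical in-memory sketch that hard-codes Cookie as the only varying header.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one stored copy per (URL, Cookie value) combination.
final class VaryingCache {
    private final Map<String, String> store = new HashMap<>();

    /** Builds the key from the URL plus the Cookie value (empty when absent). */
    static String key(String url, String cookie) {
        // A newline cannot occur in a URL, so it is a safe separator here.
        return url + "\n" + (cookie == null ? "" : cookie);
    }

    void put(String url, String cookie, String body) {
        store.put(key(url, cookie), body);
    }

    /** Returns the copy stored for this exact combination, or null on a miss. */
    String get(String url, String cookie) {
        return store.get(key(url, cookie));
    }
}
```

A request without a cookie then gets the anonymous copy, while a request carrying a session cookie misses the cache unless a copy was stored for that same cookie value.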

> What we want is for anything with a Cookie: header to totally bypass the 
> cache.  I don't know of any way to configure that.

Moin should be sending Cache-Control: Private in these cases, in 
addition to the Vary: Cookie header.  If it doesn't, it will break with 
other upstream proxies that we have no control over.  Fixing it so httpd 
can cache fixes upstream proxies too, so it is the right thing to do.

-Paul



Re: DBSight, search on database by Lucene

Posted by Joshua Slive <jo...@slive.ca>.


On Sat, 11 Jun 2005, Erik Hatcher wrote:

>
> On Jun 11, 2005, at 1:08 PM, Chris Lu wrote:
>
>> Thanks.
>> 
>> Somehow I found the "Powered By" Lucene page is "Immutable Page", even if I 
>> logged in.
>> http://wiki.apache.org/jakarta-lucene/PoweredBy
>
> Wow, it sure is.  I'm CC'ing infrastructure to find out why this page is 
> immutable.


Ughhh... Looks like another caching problem.

A shift- or ctrl-refresh should get you the right thing.  You know you 
have the right page if your userid appears in the upper-right.

It seems like technically moin needs to send Vary: cookie, but this would
completely destroy our ability to cache.

What we want is for anything with a Cookie: header to totally bypass the 
cache.  I don't know of any way to configure that.

Joshua.



Re: DBSight, search on database by Lucene

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Jun 11, 2005, at 1:08 PM, Chris Lu wrote:

> Thanks.
>
> Somehow I found the "Powered By" Lucene page is "Immutable Page",  
> even if I logged in.
> http://wiki.apache.org/jakarta-lucene/PoweredBy

Wow, it sure is.  I'm CC'ing infrastructure to find out why this page  
is immutable.

     Erik




Re: DBSight, search on database by Lucene

Posted by Chris Lu <ch...@gmail.com>.
Thanks.

Somehow the "Powered By" Lucene page shows up as "Immutable Page" for me, 
even when I am logged in.
http://wiki.apache.org/jakarta-lucene/PoweredBy

Chris Lu

Erik Hatcher wrote:

>
> On Jun 11, 2005, at 3:25 AM, Chris Lu wrote:
>
>> To demonstrate the search capability, DBSight created a demo search
>> on 1.7 million CD albums information by freedb.org provided data.
>> http://search.dbsight.com/
>
>
> Nice job!  Here's the best query:
>
>     <http://www.dbsight.com/dbs/search.do?indexName=freedb&templateName=free&q=%22tim+reynolds%22+-dave>
>
> :)
>
>> Will this qualify DBSight to be listed on "Powered By Lucene" wiki  
>> page?
>
>
> Of course.  The wiki is community maintained, so anyone who has  
> Lucene Inside is welcome to add their project/product there.
>
>     Erik




Re: DBSight, search on database by Lucene

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Jun 11, 2005, at 3:25 AM, Chris Lu wrote:
> To demonstrate the search capability, DBSight created a demo search
> on 1.7 million CD albums information by freedb.org provided data.
> http://search.dbsight.com/

Nice job!  Here's the best query:

     <http://www.dbsight.com/dbs/search.do?indexName=freedb&templateName=free&q=%22tim+reynolds%22+-dave>

:)

> Will this qualify DBSight to be listed on "Powered By Lucene" wiki  
> page?

Of course.  The wiki is community maintained, so anyone who has  
Lucene Inside is welcome to add their project/product there.

     Erik




Re: International Stemmers and Character Encoding

Posted by Edwin Mol <ed...@pandora.be>.
Please ignore my previous post; I have solved the problem.
It turned out that my IDE (Eclipse) didn't use UTF-8 encoding by default.
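
For anyone hitting the same error, here is a small sketch of why an encoding mismatch can produce "Invalid Character Constant". It assumes the source file is stored as UTF-8 and is read as ISO-8859-1, used here as a stand-in for whatever non-UTF-8 default the IDE had; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: shows one accented letter turning into two characters.
final class EncodingMismatchDemo {

    /** Decodes the UTF-8 bytes of the text as ISO-8859-1, as a misconfigured tool would. */
    static String misread(String text) {
        byte[] utf8 = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The letter is written as a Unicode escape so this file is pure ASCII.
        String original = "\u00e4";
        String misreadText = misread(original);
        System.out.println("characters as written: " + original.length());
        System.out.println("characters as misread: " + misreadText.length());
    }
}
```

Because the single letter occupies two bytes in UTF-8, the misread literal holds two characters between the single quotes, which is not a legal char constant. Setting the IDE's text file encoding to UTF-8, or passing -encoding UTF-8 to javac, makes the compiler read the file as it was written.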

Edwin




International Stemmers and Character Encoding

Posted by Edwin Mol <ed...@pandora.be>.
I have downloaded the analyzer sources from the sandbox area, but every 
*Stemmer class fails to compile with the error:
"Invalid Character Constant".
Here is what a code snippet from the DutchStemmer class looks like:

  /**
   * Substitute ä, ë, ï, ö, ü, á , é, í, ó, ú
   */
  private void substitute(StringBuffer buffer) {
    for (int i = 0; i < buffer.length(); i++) {
      switch (buffer.charAt(i)) {
        case 'ä':
        case 'á':
          {
            buffer.setCharAt(i, 'a');
            break;
          }
        case 'ë':
        case 'é':

In this example, the 'ä' character causes the problem.

I think the code is messed up because the Java file is being read with 
the wrong character encoding.
Does anyone know whether I'm correct and, more importantly, how to solve 
this problem?
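
One encoding-independent way to write such a switch is to use Unicode escapes instead of literal accented characters. The sketch below is not the sandbox code: the class name is hypothetical, StringBuilder replaces StringBuffer, and the 'e' branch is an assumption, since the snippet above is cut off at that point.

```java
// Hypothetical sketch: the same kind of substitution, written in pure ASCII.
final class EscapedSubstitute {

    /** Replaces a few accented vowels with their unaccented form, in place. */
    static void substitute(StringBuilder buffer) {
        for (int i = 0; i < buffer.length(); i++) {
            switch (buffer.charAt(i)) {
                case '\u00e4': // a with diaeresis
                case '\u00e1': // a with acute
                    buffer.setCharAt(i, 'a');
                    break;
                case '\u00eb': // e with diaeresis (assumed mapping)
                case '\u00e9': // e with acute (assumed mapping)
                    buffer.setCharAt(i, 'e');
                    break;
                default:
                    break; // every other character is left as it is
            }
        }
    }
}
```

Since the file then contains only ASCII, it compiles the same way regardless of which encoding the editor or compiler assumes.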

Thanks,

Edwin Mol

