Posted to dev@couchdb.apache.org by Jan Lehnardt <ja...@apache.org> on 2008/04/03 22:34:33 UTC

Fulltext HTTP API

Heya,

I decided to push forward the HTTP API for fulltext search.

For the sake of KISS I only enabled database-wide searching,
with NO consideration of views within CouchDB. For now.

You can now query http://couchdb/database?<urlencodedquerystring>
and get a list of matching document ids and scores back
as shown in this screenshot:

http://www.flickr.com/photos/janlehnardt/2385218843/
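
For illustration, here is a minimal Python sketch of how a client might
build such a URL. The host "couchdb" and the database name "database" are
taken from the example above. The helper name fulltext_url and the sample
query text are assumptions made up for this sketch, not part of the
branch.

```python
from urllib.parse import quote


def fulltext_url(base, database, query):
    """Build <base>/<database>?<urlencodedquerystring>.

    fulltext_url is a hypothetical helper; only the URL shape comes from
    the message above.
    """
    # Percent-encode everything in the query, including ':' and spaces.
    return f"{base.rstrip('/')}/{quote(database, safe='')}?{quote(query, safe='')}"


# The query text is an invented example.
print(fulltext_url("http://couchdb", "database", "title:couch AND apache"))
# prints http://couchdb/database?title%3Acouch%20AND%20apache
```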

This is not the end of this development of course. We still
want to be able to use views to specify what to index and
we probably want that to be in CouchDB as per earlier
discussions. My focus here was getting out something
that works instead of empty promises.

One more note: This is only available in the mochiweb
branch (that could use some developer attention *hint*,
there are only a handful of failing tests to fix).

Cheers
Jan
--

Re: Fulltext HTTP API

Posted by Søren Hilmer <sh...@widetrail.dk>.
Okay, it is also just one call.

--Søren
-- 
Søren Hilmer, M.Sc., M.Crypt.
wideTrail            Phone: +45 25481225
Pilevænget 41        Email: sh@widetrail.dk
DK-8961  Allingåbro  Web: www.widetrail.dk

On Fri, April 4, 2008 12:49, Jan Lehnardt wrote:
>
> On Apr 4, 2008, at 12:41, Søren Hilmer wrote:
>>
>>>> I guess I could save a lookup if a views query could be restricted
>>>> based on document revision.
>>>
>>> Views always only contain the latest revision of a document.
>>>
>>
>> Yes, but if I could restrict on revision I would not have to look up the
>> document id.
>>
>> Now I need to:
>> 1) Get the database; I now have the revision.
>> 2) Get the document with that revision; I now have the document id.
>> 3) Get the view with startkey_docid=<above>, count=1
>>
>> If you could restrict on revision for views, step 2 could be eliminated,
>> and step 3 would be:
>>   get the view with startkey_revision=<above>
>
> Sorry, as I understand it, this is against the design of views, so it
> is not going to happen.
>
> Cheers
> Jan
> --



Re: Fulltext HTTP API

Posted by Jan Lehnardt <ja...@apache.org>.
On Apr 4, 2008, at 12:41, Søren Hilmer wrote:
>
>>> I guess I could save a lookup if a views query could be restricted
>>> based on document revision.
>>
>> Views always only contain the latest revision of a document.
>>
>
> Yes, but if I could restrict on revision I would not have to look up the
> document id.
>
> Now I need to:
> 1) Get the database; I now have the revision.
> 2) Get the document with that revision; I now have the document id.
> 3) Get the view with startkey_docid=<above>, count=1
>
> If you could restrict on revision for views, step 2 could be eliminated,
> and step 3 would be:
>   get the view with startkey_revision=<above>

Sorry, as I understand it, this is against the design of views, so it
is not going to happen.

Cheers
Jan
--

Re: Fulltext HTTP API

Posted by Søren Hilmer <sh...@widetrail.dk>.
On Fri, April 4, 2008 11:55, Jan Lehnardt wrote:
> On Apr 4, 2008, at 10:10, Søren Hilmer wrote:
>> Great!!
>>
>> In relation to this, I am working on indexing using this algorithm:
>>
>> when a database change notification is received
>>   1. find the changed document
>>   2. for all views defined in the fulltext design document
>>      a. get the view with startkey_docid=<docid for changed doc> and count=1
>>      b. if total_rows is 1 then re-index this view for this document
>>
>> Makes sense?
>
> Sounds sensible.

Good

>
>> I guess I could save a lookup if a views query could be restricted
>> based on document revision.
>
> Views always only contain the latest revision of a document.
>

Yes, but if I could restrict on revision I would not have to look up the
document id.

Now I need to:
 1) Get the database; I now have the revision.
 2) Get the document with that revision; I now have the document id.
 3) Get the view with startkey_docid=<above>, count=1

If you could restrict on revision for views, step 2 could be eliminated,
and step 3 would be:
   get the view with startkey_revision=<above>
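
To make the difference in round trips concrete, here is a rough Python
sketch of the two flows. Everything in it is a stand-in: get_database,
get_document_by_rev and get_view are hypothetical stubs that only record
which call was made, and startkey_revision is the parameter proposed
above, not an existing CouchDB option.

```python
calls = []  # records which (stubbed) requests were made


def get_database():
    # Stub for step 1: pretend the database info yields a revision.
    calls.append("database")
    return "rev-1"


def get_document_by_rev(rev):
    # Stub for step 2: pretend the revision resolves to a document id.
    calls.append("document")
    return "doc-1"


def get_view(**params):
    # Stub for step 3: just echo the query parameters it was given.
    calls.append("view")
    return {"params": params}


def current_lookup():
    """The three-step flow described above."""
    rev = get_database()
    doc_id = get_document_by_rev(rev)
    return get_view(startkey_docid=doc_id, count=1)


def proposed_lookup():
    """The two-step flow, assuming the proposed startkey_revision existed."""
    rev = get_database()
    return get_view(startkey_revision=rev)


current_lookup()
print(calls)  # ['database', 'document', 'view']
calls.clear()
proposed_lookup()
print(calls)  # ['database', 'view']
```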

>
>> For searching I need to do something similar. The best idea I have come
>> up with is to still use the document id in the Lucene index, and then do
>> the same startkey_docid, count=1 lookup for the view if the return value
>> in the design document is specified as view.
>> For this to make sense, the search API needs to be extended with the view
>> you are searching. Or do we want to always search all views defined in
>> the design document?
>
> We'd extend the HTTP API here, I guess.

I reckoned so.

Have fun
  Søren

>
> Cheers
> Jan
> --
>
>>
>>
>> --Søren
>>
>> --
>> Søren Hilmer, M.Sc., M.Crypt.
>> wideTrail            Phone: +45 25481225
>> Pilevænget 41        Email: sh@widetrail.dk
>> DK-8961  Allingåbro  Web: www.widetrail.dk
>>

-- 
Søren Hilmer, M.Sc., M.Crypt.
wideTrail            Phone: +45 25481225
Pilevænget 41        Email: sh@widetrail.dk
DK-8961  Allingåbro  Web: www.widetrail.dk




Re: Fulltext HTTP API

Posted by Jan Lehnardt <ja...@apache.org>.
On Apr 4, 2008, at 10:10, Søren Hilmer wrote:
> Great!!
>
> In relation to this, I am working on indexing using this algorithm:
>
> when a database change notification is received
>   1. find the changed document
>   2. for all views defined in the fulltext design document
>      a. get the view with startkey_docid=<docid for changed doc> and count=1
>      b. if total_rows is 1 then re-index this view for this document
>
> Makes sense?

Sounds sensible.

> I guess I could save a lookup if a views query could be restricted
> based on document revision.

Views always only contain the latest revision of a document.


> For searching I need to do something similar. The best idea I have come
> up with is to still use the document id in the Lucene index, and then do
> the same startkey_docid, count=1 lookup for the view if the return value
> in the design document is specified as view.
> For this to make sense, the search API needs to be extended with the view
> you are searching. Or do we want to always search all views defined in
> the design document?

We'd extend the HTTP API here, I guess.

Cheers
Jan
--

>
>
> --Søren
>
> -- 
> Søren Hilmer, M.Sc., M.Crypt.
> wideTrail            Phone: +45 25481225
> Pilevænget 41        Email: sh@widetrail.dk
> DK-8961  Allingåbro  Web: www.widetrail.dk
>


Re: Fulltext HTTP API

Posted by Søren Hilmer <sh...@widetrail.dk>.
Great!!

In relation to this, I am working on indexing using this algorithm:

when a database change notification is received
   1. find the changed document
   2. for all views defined in the fulltext design document
      a. get the view with startkey_docid=<docid for changed doc> and count=1
      b. if total_rows is 1 then re-index this view for this document
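
The steps above could be sketched in Python roughly as follows. This is
only an outline under stated assumptions: views_to_reindex is a made-up
name, and query_view is a hypothetical caller-supplied function that
performs the view request and returns the parsed response as a dict. The
total_rows test mirrors step 2b literally; whether total_rows behaves that
way for a given CouchDB version is not confirmed here.

```python
def views_to_reindex(doc_id, view_names, query_view):
    """Return the fulltext views that should be re-indexed for doc_id.

    query_view(view_name, startkey_docid=..., count=...) is a hypothetical
    callable supplied by the caller; it is not a CouchDB client API.
    """
    stale = []
    for name in view_names:
        # Step 2a: ask the view for at most one row starting at this doc id.
        result = query_view(name, startkey_docid=doc_id, count=1)
        # Step 2b: the algorithm re-indexes when total_rows is 1.
        if result.get("total_rows") == 1:
            stale.append(name)
    return stale


# Usage with a fake view lookup, so the sketch runs without a server.
def fake_query_view(name, startkey_docid, count):
    rows_per_view = {"by_title": 1, "by_author": 0}
    return {"total_rows": rows_per_view[name]}


print(views_to_reindex("doc-1", ["by_title", "by_author"], fake_query_view))
# prints ['by_title']
```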

Makes sense?

I guess I could save a lookup if a views query could be restricted based
on document revision.

For searching I need to do something similar. The best idea I have come
up with is to still use the document id in the Lucene index, and then do
the same startkey_docid, count=1 lookup for the view if the return value in
the design document is specified as view.
For this to make sense, the search API needs to be extended with the view
you are searching. Or do we want to always search all views defined in the
design document?
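
As a rough illustration of the search side, the Python sketch below maps
search hits (document id plus score) back to rows of one view. The names
hits_to_view_rows and query_view are hypothetical, and the response shape
(a "rows" list whose entries carry an "id") is an assumption for this
sketch rather than something specified in this thread.

```python
def hits_to_view_rows(hits, view_name, query_view):
    """Pair each search hit with its row in the given view, if any.

    hits is a list of (doc_id, score) tuples as a search backend might
    return them; query_view is a hypothetical caller-supplied function.
    """
    rows = []
    for doc_id, score in hits:
        # Same lookup as for indexing: one row starting at this doc id.
        result = query_view(view_name, startkey_docid=doc_id, count=1)
        for row in result.get("rows", []):
            # Keep the row only if it really belongs to the hit document.
            if row.get("id") == doc_id:
                rows.append({"score": score, "row": row})
    return rows


# Usage with a fake view that only contains doc-1.
def fake_query_view(view_name, startkey_docid, count):
    indexed = {"doc-1": {"id": "doc-1", "key": "Hello", "value": None}}
    row = indexed.get(startkey_docid)
    return {"rows": [row] if row else []}


hits = [("doc-1", 0.9), ("doc-2", 0.4)]
print(hits_to_view_rows(hits, "by_title", fake_query_view))
```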

--Søren

-- 
Søren Hilmer, M.Sc., M.Crypt.
wideTrail            Phone: +45 25481225
Pilevænget 41        Email: sh@widetrail.dk
DK-8961  Allingåbro  Web: www.widetrail.dk




Re: Fulltext HTTP API

Posted by Jan Lehnardt <ja...@apache.org>.
On Apr 3, 2008, at 23:14, Noah Slater wrote:
> What specifically do you use in the query string?


That depends entirely on which search engine you
run in the backend and whatever it understands.
CouchDB just hands the query string through.
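
A toy Python sketch of that pass-through idea, purely for illustration:
handle_search and echo_backend are invented names, and decoding the query
string before handing it over is an assumption of this sketch, not a
statement about what the mochiweb branch does.

```python
from urllib.parse import unquote, urlsplit


def handle_search(url, backend):
    """Split off the database name and pass the query on uninterpreted.

    backend is a hypothetical callable standing in for whatever search
    engine runs behind CouchDB.
    """
    parts = urlsplit(url)
    database = parts.path.strip("/")
    # No parsing of the query itself; the backend decides what it means.
    return backend(database, unquote(parts.query))


def echo_backend(database, query):
    # Stand-in backend that just reports what it was handed.
    return {"database": database, "query": query}


print(handle_search("http://couchdb/database?title%3Acouch%20AND%20apache",
                    echo_backend))
# prints {'database': 'database', 'query': 'title:couch AND apache'}
```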

Cheers
Jan
--


Re: Fulltext HTTP API

Posted by Noah Slater <ns...@bytesexual.org>.
What specifically do you use in the query string?


-- 
Noah Slater <http://bytesexual.org/>