Posted to user@couchdb.apache.org by Manokaran K <ma...@gmail.com> on 2011/11/09 11:01:02 UTC

Couchdb-Lucene 0.8 '-' char to be encoded?

Hi,

I just upgraded my application from Couchdb-Lucene 0.6 to 0.8. I observe
that when I query for docs whose text fields contain a '-', say,
"2010-2011", I get all documents that have "2010" in them (I get documents
that have "2010-2011" as well as "2009-2010"). It was not so in 0.6! Do I
have to URL encode the '-' character?

thanks in advance,
mano

Re: Couchdb-Lucene 0.8 '-' char to be encoded?

Posted by Manokaran K <ma...@gmail.com>.

On Thu, Nov 10, 2011 at 1:12 AM, Robert Newson <ro...@gmail.com> wrote:

> It's already available in the latest 0.8-SNAPSHOT (there's no 0.8 release
> yet).
>
>
Thanks again.

Re: Couchdb-Lucene 0.8 '-' char to be encoded?

Posted by Robert Newson <ro...@gmail.com>.

It's already available in the latest 0.8-SNAPSHOT (there's no 0.8 release yet).

B.

On 9 November 2011 17:43, Manokaran K <ma...@smartgrader.com> wrote:
> On Wed, Nov 9, 2011 at 4:26 PM, Robert Newson <ro...@gmail.com> wrote:
>
>> Hi,
>>
>> This is probably because 0.8-SNAPSHOT uses a version of Lucene later
>> than 3.1, where the behavior of the standard tokenizer changed. I've
>> added another tokenizer option called "classic" which gives the same
>> results as the pre-3.1 tokenizer.
>>
>>
> Thanks very much. Is the classic tokenizer already available in 0.8 or are
> you yet to push the changes to github?
>
> regds,
> mano
>

Re: Couchdb-Lucene 0.8 '-' char to be encoded?

Posted by Manokaran K <ma...@smartgrader.com>.

On Wed, Nov 9, 2011 at 4:26 PM, Robert Newson <ro...@gmail.com> wrote:

> Hi,
>
> This is probably because 0.8-SNAPSHOT uses a version of Lucene later
> than 3.1, where the behavior of the standard tokenizer changed. I've
> added another tokenizer option called "classic" which gives the same
> results as the pre-3.1 tokenizer.
>
>
Thanks very much. Is the classic tokenizer already available in 0.8 or are
you yet to push the changes to github?

regds,
mano

Re: Couchdb-Lucene 0.8 '-' char to be encoded?

Posted by Robert Newson <ro...@gmail.com>.

Hi,

This is probably because 0.8-SNAPSHOT uses a version of Lucene later
than 3.1, where the behavior of the standard tokenizer changed. I've
added another tokenizer option called "classic" which gives the same
results as the pre-3.1 tokenizer.

B.

On 9 November 2011 10:01, Manokaran K <ma...@gmail.com> wrote:
> Hi,
>
> I just upgraded my application from Couchdb-Lucene 0.6 to 0.8. I observe
> that when I query for docs whose text fields contain a '-', say,
> "2010-2011", I get all documents that have "2010" in them (I get documents
> that have "2010-2011" as well as "2009-2010"). It was not so in 0.6! Do I
> have to URL encode the '-' character?
>
> thanks in advance,
> mano
>
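
To illustrate why the tokenizer change described above would produce the extra hits, here is a rough Python sketch. It does not call Lucene at all: `standard_like` and `classic_like` are hypothetical stand-ins that only approximate the two behaviors (breaking at the hyphen versus keeping a hyphenated run that contains a digit as one token), and `matches` assumes the query tokens are combined with OR, which may not hold for every query parser configuration.

```python
import re

# Toy approximations only; these are NOT the real Lucene tokenizers.

def standard_like(text):
    """Approximate the newer standard tokenizer: the hyphen acts as a break."""
    return re.findall(r"[A-Za-z0-9]+", text.lower())

def classic_like(text):
    """Approximate the classic tokenizer: a hyphenated run that contains
    a digit is kept as one token; other hyphenated runs are split."""
    tokens = []
    for run in re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text.lower()):
        if any(ch.isdigit() for ch in run):
            tokens.append(run)
        else:
            tokens.extend(run.split("-"))
    return tokens

def matches(query, field, tokenize):
    """Assumed OR semantics: a hit if any query token is in the field."""
    return bool(set(tokenize(query)) & set(tokenize(field)))

docs = ["2010-2011", "2009-2010", "2008-2009"]
standard_hits = [d for d in docs if matches("2010-2011", d, standard_like)]
classic_hits = [d for d in docs if matches("2010-2011", d, classic_like)]

print("standard-like:", standard_hits)
print("classic-like: ", classic_hits)
```

Under these assumptions the standard-like tokenizer turns the query into the two tokens "2010" and "2011", so a document containing "2009-2010" also matches, while the classic-like tokenizer keeps "2010-2011" whole and only the exact value matches. URL encoding the '-' would not change this, since the split happens during analysis rather than in the URL.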