Posted to java-user@lucene.apache.org by Jamie <ja...@stimulussoft.com> on 2009/01/17 00:02:16 UTC
Search Across All Fields
Hi Everyone
I have two queries:
Query 1
======
(attachments:"beauty supply") AND sentdate:[d20081117010000 TO
d20090117235900]
Query 2
======
(priority:beauty attach:beauty score:beauty size:beauty sentdate:beauty
archivedate:beauty receiveddate:beauty from:beauty to:beauty
subject:beauty cc:beauty bcc:beauty deliveredto:beauty flag:beauty
sensitivity:beauty sender:beauty recipient:beauty body:beauty
attachments:beauty attachname:beauty AND priority:supply attach:supply
score:supply size:supply sentdate:supply archivedate:supply
receiveddate:supply from:supply to:supply subject:supply cc:supply
bcc:supply deliveredto:supply flag:supply sensitivity:supply
sender:supply recipient:supply body:supply attachments:supply
attachname:supply) AND sentdate:[d20081117010000 TO d20090117235900]
Query 1 returns 138 results, while Query 2 returns 0 results. Any idea
why? The second query is meant to search across all fields, whereas
the first query specifies one field. Is there a better way to
conduct a search across all fields? Am I missing something?
Thanks in advance for your help!
Regards,
Jamie
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Search Across All Fields
Posted by Jamie <ja...@stimulussoft.com>.
Hi Erick
Thanks for the pointer. I don't know how I missed that. Our index sizes
are absolutely huge, so it's not really practical to add an all_text
field. It would be great if you could introduce a macro or something that
one could use to specify all fields.
Thanks anyway!
Jamie
Erick Erickson wrote:
> I think you forgot a set of parentheses: a close paren right before
> the AND and an open paren right after it.
>
> Depending upon how big your index is, a MUCH easier way to do
> this is to index another field, call it all_text say, and add all your
> terms to that field as well as to the individual ones, then search your
> all_text field instead.
>
> Best
> Erick
>
>
Re: Search Across All Fields
Posted by Erick Erickson <er...@gmail.com>.
I think you forgot a set of parentheses: a close paren right before
the AND and an open paren right after it.
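A minimal sketch of the grouping described above, in plain Java with no Lucene dependency. It assumes the usual query parser behavior, where an AND between two ungrouped clauses makes just those two clauses required (here attachname:beauty and priority:supply), which would explain the 0 results. The class and method names are hypothetical, and the terms are assumed to contain no characters that need escaping.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class AllFieldsQuery {

    // Builds "(f1:term f2:term ...)": the term may match in any listed field.
    static String anyField(List<String> fields, String term) {
        List<String> clauses = new ArrayList<String>();
        for (String field : fields) {
            clauses.add(field + ":" + term);
        }
        return "(" + String.join(" ", clauses) + ")";
    }

    // Joins one parenthesized group per term with AND, so every term must
    // match somewhere, though not necessarily in the same field.
    static String allTerms(List<String> fields, List<String> terms) {
        List<String> groups = new ArrayList<String>();
        for (String term : terms) {
            groups.add(anyField(fields, term));
        }
        return "(" + String.join(" AND ", groups) + ")";
    }

    public static void main(String[] args) {
        // Shortened field list for illustration; the real query lists many more.
        List<String> fields = Arrays.asList("subject", "body", "attachments");
        List<String> terms = Arrays.asList("beauty", "supply");
        String query = allTerms(fields, terms)
                + " AND sentdate:[d20081117010000 TO d20090117235900]";
        System.out.println(query);
    }
}
```

Note that this matches documents containing both words anywhere, which is broader than the phrase search attachments:"beauty supply" in Query 1.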
Depending upon how big your index is, a MUCH easier way to do
this is to index another field, call it all_text say, and add all your
terms to that field as well as to the individual ones, then search your
all_text field instead.
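Roughly, the catch-all field idea looks like the sketch below. It is plain Java and only shows how the combined text could be assembled before indexing; the class name, method name, and sample message are made up for illustration, and adding the result to the index as an extra searchable field is left to whatever indexing code you already have.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class CatchAllField {

    // Name of the extra field, following the suggestion above.
    static final String ALL_TEXT = "all_text";

    // Returns a copy of the per-field values plus one extra entry whose
    // value is the text of every field joined by single spaces.
    static Map<String, String> withAllText(Map<String, String> fields) {
        Map<String, String> out = new LinkedHashMap<String, String>(fields);
        out.put(ALL_TEXT, String.join(" ", fields.values()));
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> message = new LinkedHashMap<String, String>();
        message.put("subject", "Order update");
        message.put("body", "Your beauty supply order has shipped");
        System.out.println(withAllText(message).get(ALL_TEXT));
    }
}
```

The cost is that every term is indexed twice, so the index grows; in exchange, the search becomes a single clause such as all_text:"beauty supply".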
Best
Erick
Words that need protection from stemming, i.e., protwords.txt
Posted by David Woodward <dw...@loc.gov>.
Hi.
Any good protwords.txt out there?
In a fairly standard Solr analyzer chain, we use the English Porter analyzer like so:
<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
For most purposes the Porter stemmer does just fine, but occasionally words come along that really don't work out too well, e.g.,
"maine" is stemmed to "main" - clearly goofing up precision about "Maine" without doing much good for variants of "main".
So - I have an entry for my protwords.txt. What else should go in there?
Thanks for your ideas,
Dave Woodward
Re: Words that need protection from stemming, i.e., protwords.txt
Posted by Chris Hostetter <ho...@fucit.org>.
: Subject: Words that need protection from stemming, i.e., protwords.txt
: References: <49...@gmail.com>
: <39...@gmail.com>
: <49...@stimulussoft.com>
: In-Reply-To: <49...@stimulussoft.com>
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a fresh email. Even if you change the
subject line of your email, other mail headers still track which thread
you replied to and your question is "hidden" in that thread and gets less
attention. It makes following discussions in the mailing list archives
particularly difficult.
See Also: http://en.wikipedia.org/wiki/Thread_hijacking
-Hoss
Re: Words that need protection from stemming, i.e., protwords.txt
Posted by patrick o'leary <pj...@pjaol.com>.
Porter is a little outdated; I've found KStem much better:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem
You'll still need a good protected word list, but KStem is just a little
nicer.
Re: Search Across All Fields
Posted by Chris Hostetter <ho...@fucit.org>.
: Subject: Search Across All Fields
: References: <49...@gmail.com>
: <39...@gmail.com>
: In-Reply-To: <39...@gmail.com>
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a fresh email. Even if you change the
subject line of your email, other mail headers still track which thread
you replied to and your question is "hidden" in that thread and gets less
attention. It makes following discussions in the mailing list archives
particularly difficult.
See Also: http://en.wikipedia.org/wiki/Thread_hijacking
-Hoss
RE: Search Across All Fields
Posted by "Zhang, Lisheng" <Li...@BroadVision.com>.
Hi,
Inside (priority:beauty ...) there is an AND.
Is that the operator you want?
Best regards, Lisheng