Posted to solr-user@lucene.apache.org by PeterKerk <ve...@hotmail.com> on 2010/11/12 12:32:35 UTC

full text search in multiple fields

I want to provide a full text search function.

This function has to search through the 2 fields: "title" and "description"
that I have defined in my schema.xml (both of type "string").

Now, since Solr doesn't (by default) provide an OR operator, I thought I
should somehow combine these fields into 1 field and THEN search that single
field.

Is that correct and if so, how would I do that?

Thanks!
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p1888328.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
@Erick: Nope, those fields indeed aren't chainable. I used iorixxx's solution
and now it works. :)
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p1903486.html

Re: full text search in multiple fields

Posted by Erick Erickson <er...@gmail.com>.
Nowhere (unless I overlooked it) do you ever populate city_search
in the first place; it's simply defined.

Also, I don't think (but check it) that <copyField> is chainable.
I don't *think* that
<copyField source="city" dest="city_search"/>
<copyField source="city_search" dest="citytext_search"/>

will populate citytext_search. Ahmet's suggestion to do two
<copyField>s with source="city" is spot-on....

Best
Erick
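
To make the difference concrete, here is a minimal Python sketch of the
behaviour described above. It is a toy model, not Solr code: the function
apply_copy_fields and the sample document are made up for illustration, and
it assumes (as suspected above) that each <copyField> reads only the values
that arrived with the incoming document, never the output of another copy.

```python
def apply_copy_fields(doc, copy_fields):
    """Toy model of copyField: copies read from the incoming document only."""
    result = {name: list(values) for name, values in doc.items()}
    for source, dest in copy_fields:
        # Look the source up in the ORIGINAL document, not in `result`,
        # so a destination filled by one rule cannot feed another rule.
        for value in doc.get(source, []):
            result.setdefault(dest, []).append(value)
    return result


incoming = {"city": ["amsterdam"]}

# Chained rules: city -> city_search -> citytext_search
chained = apply_copy_fields(
    incoming,
    [("city", "city_search"), ("city_search", "citytext_search")],
)

# Two rules that both read from city
fan_out = apply_copy_fields(
    incoming,
    [("city", "city_search"), ("city", "citytext_search")],
)

print(chained)
print(fan_out)
```

Under that assumption the chained rules never fill citytext_search, while
two rules with source="city" fill both destination fields.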

On Sun, Nov 14, 2010 at 5:48 PM, Ahmet Arslan <io...@yahoo.com> wrote:

>
> > but I dont understand why its not indexed.
>
> Probably something wrong with data-config.xml.
>
> > So you can see, that the city field DOES index some data,
> > whereas the
> > city_search and citytext_search have NO data at all...
>
> Then populate these two fields from city via copyField. It is 100% legal.
>
> <copyField source="city" dest="city_search"/>
> <copyField source="city" dest="citytext_search"/>
>
>
>
>

Re: full text search in multiple fields

Posted by Ahmet Arslan <io...@yahoo.com>.
> but I dont understand why its not indexed.

Probably something wrong with data-config.xml.

> So you can see, that the city field DOES index some data,
> whereas the
> city_search and citytext_search have NO data at all...

Then populate these two fields from city via copyField. It is 100% legal.

<copyField source="city" dest="city_search"/>
<copyField source="city" dest="citytext_search"/>


      

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
Ok, that makes sense ;)

But I don't understand why it's not indexed.
IMO, I've defined the "city_search" field exactly the same as "city" in the
schema.xml:
<field name="city" type="string" indexed="true" stored="true"/>
<field name="city_search" type="string" indexed="true" stored="true"/>

<copyField source="city_search" dest="citytext_search"/>


So I checked the schema.jsp you suggested.

When under fields I click on the respective fields, I get this output:

Field: city
Field Type: string
Properties: Indexed, Stored, Omit Norms, Sort Missing Last
Schema: Indexed, Stored, Omit Norms, Sort Missing Last
Index: Indexed, Stored, Omit Norms
Index Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
Docs: 7
Distinct: 6


Field: city_search
Field Type: string
Properties: Indexed, Stored, Omit Norms, Sort Missing Last
Copied Into: citytext_search
Index Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer
Query Analyzer: org.apache.solr.schema.FieldType$DefaultAnalyzer 


Field: citytext_search
Field Type: text
Properties: Indexed, Tokenized, Stored
Copied From: city_search
Position Increment Gap: 100
Index Analyzer: org.apache.solr.analysis.TokenizerChain Details
Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory
Filters:
   1. org.apache.solr.analysis.StopFilterFactory args:{words:
stopwords_dutch.txt ignoreCase: true luceneMatchVersion: LUCENE_24 }
   2. org.apache.solr.analysis.WordDelimiterFilterFactory
args:{splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 1
luceneMatchVersion: LUCENE_24 generateWordParts: 1 catenateAll: 0
catenateNumbers: 1 }
   3. org.apache.solr.analysis.LowerCaseFilterFactory
args:{luceneMatchVersion: LUCENE_24 }
   4. org.apache.solr.analysis.EnglishPorterFilterFactory args:{protected:
protwords.txt luceneMatchVersion: LUCENE_24 }
   5. org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory
args:{luceneMatchVersion: LUCENE_24 }
Query Analyzer: org.apache.solr.analysis.TokenizerChain Details


So you can see that the city field DOES index some data, whereas
city_search and citytext_search have NO data at all...

This debugging has confirmed that no data is indexed, but to me it doesn't
provide any more info on what I did wrong....

Do you have any suggestions?

-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p1901551.html

Re: full text search in multiple fields

Posted by Ahmet Arslan <io...@yahoo.com>.
 
> both queries give me 0 results...

Then your field(s) are not populated. You can debug with /admin/dataimport.jsp
or /admin/schema.jsp


      

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
both queries give me 0 results...
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p1900648.html

Re: full text search in multiple fields

Posted by Ahmet Arslan <io...@yahoo.com>.

--- On Sun, 11/14/10, PeterKerk <ve...@hotmail.com> wrote:

> From: PeterKerk <ve...@hotmail.com>
> Subject: Re: full text search in multiple fields
> To: solr-user@lucene.apache.org
> Date: Sunday, November 14, 2010, 8:52 PM
> 
> Ok, thanks. it works now for title and description fields.
> :)
> 
> But now I also need it for the city. And I cant get that to
> work, even
> though im doing the exact same (or so I think).
> 
> I now have the code below for the city field. 
> (Im defining city field twice in my data-config and
> schema.xml but thats
> because I want the city field to be indexed both as string
> (whole value) and
> as text. Though thats not the point now.)
> 
> data-config.xml
> <field name="city" column="CITY" />
> <field name="city_search" column="CITY" />
> 
> 
> schema.xml
> <field name="city" type="string" indexed="true"
> stored="true"/>
> <field name="city_search" type="text_ws" indexed="true"
> stored="true"/>  
> <!-- tried type "text", "text_ws" and "string" -->
> <field name="citytext_search" type="text" indexed="true"
> stored="true"/>
> <copyField source="city_search"
> dest="citytext_search"/>
> 
> URL:
> http://localhost:8983/solr/db/select/?q=amsterdam&defType=dismax&qf=citytext_search^10.0
> 
> The value in the db for the city field is "amsterdam"

Everything seems fine. What happens when you do these two queries?

8983/solr/db/select/?q=citytext_search:amsterdam&defType=lucene
8983/solr/db/select/?q=citytext_search:[* TO *]&defType=lucene





      

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
Ok, thanks. It works now for the title and description fields. :)

But now I also need it for the city, and I can't get that to work, even
though I'm doing the exact same thing (or so I think).

I now have the code below for the city field.
(I'm defining the city field twice in my data-config and schema.xml, but that's
because I want the city field to be indexed both as string (whole value) and
as text. Though that's not the point now.)

data-config.xml
<field name="city" column="CITY" />
<field name="city_search" column="CITY" />


schema.xml
<field name="city" type="string" indexed="true" stored="true"/>
<field name="city_search" type="text_ws" indexed="true" stored="true"/>  
<!-- tried type "text", "text_ws" and "string" -->
<field name="citytext_search" type="text" indexed="true" stored="true"/>
<copyField source="city_search" dest="citytext_search"/>

URL:
http://localhost:8983/solr/db/select/?q=amsterdam&defType=dismax&qf=citytext_search^10.0

The value in the db for the city field is "amsterdam"

But no results are found. And yes: I restarted the server, reloaded
data-config, and did a full import.
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p1900535.html

Re: full text search in multiple fields

Posted by Ahmet Arslan <io...@yahoo.com>.
> I checked the url: http://wiki.apache.org/solr/DisMaxQParserPlugin
> 
> When I execute this url on my local machine:
> http://localhost:8983/solr/select/?q=video&qt=defType=dismax&qf=features^20.0+text^0.3
> 
> I get the error: unknown handler: defType=dismax
> 
> So where can I download that handler and how should I
> include it in my
> schema.xml?

You have a syntax error. Just use &defType=dismax.

http://localhost:8983/solr/select/?q=video&defType=dismax&qf=features^20.0+text^0.3
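
Building the query string programmatically is one way to avoid this kind of
slip. A small Python sketch, assuming the host, path and field names from the
URL above; the helper name dismax_url is hypothetical, and note that
urlencode will percent-encode the ^ in the boosts:

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Host and path as used in this thread
BASE = "http://localhost:8983/solr/select/"


def dismax_url(user_query, query_fields):
    """Build a select URL that asks for the dismax query parser."""
    params = {
        "q": user_query,
        "defType": "dismax",  # a parameter of its own, not a value of qt
        "qf": " ".join(query_fields),  # whitespace-separated field^boost list
    }
    return BASE + "?" + urlencode(params)


url = dismax_url("video", ["features^20.0", "text^0.3"])
print(url)

# Decode the query string again to check what the server would receive
print(parse_qs(urlsplit(url).query))
```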


      

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
All helpful responses, so thank you for that.

I checked the url: http://wiki.apache.org/solr/DisMaxQParserPlugin

When I execute this url on my local machine:
http://localhost:8983/solr/select/?q=video&qt=defType=dismax&qf=features^20.0+text^0.3

I get the error: unknown handler: defType=dismax

So where can I download that handler and how should I include it in my
schema.xml?

Thanks!
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p1893625.html

Re: full text search in multiple fields

Posted by Erick Erickson <er...@gmail.com>.
In addition to the other replies, do be careful about "string" types. It's
probably not what you want as it indexes the entire input as a single
token. For instance, indexing "great expectations" as a string type
would NOT get you a hit when searching for "great". Think about
a text type instead...

And you can certainly construct queries like q=title:great AND
description:great

Or the dismax would work for you as others suggested.

See defaultOperator in schema.xml, it is applied automatically unless
you override.

Best
Erick
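
A rough Python sketch of the "string" versus text distinction described
above. It only imitates the idea and is not how Solr is implemented: the
function names are made up, and the "text" side is simplified to whitespace
splitting plus lowercasing.

```python
def index_as_string(value):
    """'string' type: the whole input becomes a single indexed term."""
    return {value}


def index_as_text(value):
    """Simplified text type: split on whitespace and lowercase each token."""
    return {token.lower() for token in value.split()}


def term_matches(term, indexed_terms):
    """A plain term query matches only if the exact term was indexed."""
    return term in indexed_terms


title = "great expectations"

print(term_matches("great", index_as_string(title)))  # False
print(term_matches("great", index_as_text(title)))    # True
```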

On Fri, Nov 12, 2010 at 6:32 AM, PeterKerk <ve...@hotmail.com> wrote:

>
> I want to provide a full text search function.
>
> This function has to search through the 2 fields: "title" and "description"
> that I have defined in my schema.xml (both of type "string").
>
> Now, since solr doesnt (by default) provide an or operator, I thought I
> should somehow combine these fields into 1 field and THEN search that
> single
> field.
>
> Is that correct and if so, how would I do that?
>
> Thanks!
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p1888328.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
whoops :)
It was directed at iorixxx, in the first post before me
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2079581.html

Re: full text search in multiple fields

Posted by Dennis Gearon <ge...@sbcglobal.net>.
For those of us who come late to a thread, having at least the last post that 
you're replying to would help. Me at least ;-)

 Dennis Gearon


Signature Warning
----------------
It is always a good idea to learn from your own mistakes. It is usually a better 
idea to learn from others’ mistakes, so you do not have to make them yourself. 
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.



----- Original Message ----
From: PeterKerk <ve...@hotmail.com>
To: solr-user@lucene.apache.org
Sent: Sun, December 12, 2010 1:47:35 PM
Subject: Re: full text search in multiple fields


I went for the * operator, and it works now! Thanks!
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2075140.html

Sent from the Solr - User mailing list archive at Nabble.com.


Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
I went for the * operator, and it works now! Thanks!
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2075140.html

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
Oops, sloppy; that was a copy-paste error.

I now have: 

WORKING:
http://localhost:8983/solr/db/select/?indent=on&q=title_search:Pappegay&defType=lucene&fl=id,title

NOT WORKING:
http://localhost:8983/solr/db/select/?indent=on&q=title_search:Pappegay*&defType=lucene&fl=id,title
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2134044.html

Re: full text search in multiple fields

Posted by Ahmet Arslan <io...@yahoo.com>.
> 
> When I do:
> &q=title_search:Pappegay*&defType=lucene&q=*:*&fl=id,title
> 
> nothing is found.
> 
> but if I do:
> &q=title_search:Pappegay&defType=lucene&q=*:*&fl=id,title
> 
> the location IS found.
> 
> I do need a wildcard though, since users may also search on
> parts of the
> title (as described earlier in this post). But this looks
> almost as if the
> location is not found if the wildcard is on the end and the
> searched string
> is no longer than the position of the wildcard(if that
> makes sense :)

Why are you using two q parameters in your search URL? &q=*:*&q=title_search:Pappegay*


      

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
@iorixxx: removing that line did solve the problem, thanks!
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2138629.html

Re: full text search in multiple fields

Posted by Erick Erickson <er...@gmail.com>.
My first guess: You've got some sort of stemming going on at index time so
tuinkamer is getting indexed as tuinkam or something. To find out, look
at your admin page, the "schema browser".

Another interesting page is admin/analysis, which can show you what happens
at each step of the indexing process (check the debug checkbox). Be a little
cautious with wildcards in the query though, the output may be a little
misleading.

You might try getting a copy of Luke to examine your index and see what's
actually in there. Often problems like this come from what you think got
indexed being different from what actually was indexed.

Finally, you can use the &debugQuery=on to examine the query, although in
this particular case I don't think it would have helped.

Best
Erick

On Thu, Dec 23, 2010 at 2:20 PM, PeterKerk <ve...@hotmail.com> wrote:

>
> Sorry to bother you again, but it still doesnt seem to work all the time...
>
> This (what you solved earlier) works:
> &q=title_search:Pappegay&defType=lucene&fl=id,title
>
>
> But for another location, which value in DB is: "de tuinkamer"
>
> When I query the id of that location:
> &q=id:431&fl=id,title
> the location is found, so it IS indexed...
>
>
> But this query DOESNT work:
>
> &q=title_search:tuinkamer*&defType=lucene&fl=id,title
>
> And this one DOES:
> &q=title_search:tuin*&defType=lucene&fl=id,title
>
> for me this is unexpected...what can it be?
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2137983.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>

Re: full text search in multiple fields

Posted by Ahmet Arslan <io...@yahoo.com>.
> But for another location, which value in DB is: "de
> tuinkamer"
> 
> When I query the id of that location:
> &q=id:431&fl=id,title
> the location is found, so it IS indexed...
> 
> 
> But this query DOESNT work:
> 
> &q=title_search:tuinkamer*&defType=lucene&fl=id,title
> 
> And this one DOES:
> &q=title_search:tuin*&defType=lucene&fl=id,title
> 
> for me this is unexpected...what can it be?

As you can verify from /solr/admin/analysis.jsp, tuinkamer is reduced to tuinkam by EnglishPorterFilterFactory. So it is expected that &q=title_search:tuinkamer* won't return that document. Remember that the wildcard term tuinkamer* is not analyzed; it is compared as typed against what is indexed. That said, if you plan to use wildcards, remove EnglishPorterFilterFactory from your analyzers.
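
The mismatch can be sketched in a few lines of Python. This is only an
illustration: the set of indexed terms is hypothetical and simply uses the
stemmed form "tuinkam" mentioned above, and prefix_query_matches is a made-up
helper, not a Solr API.

```python
def prefix_query_matches(prefix, indexed_terms):
    """A prefix query is compared to the indexed terms as typed: no stemming."""
    return any(term.startswith(prefix) for term in indexed_terms)


# Hypothetical index content after stemming ("tuinkamer" stored as "tuinkam")
indexed_terms = {"tuinkam"}

print(prefix_query_matches("tuinkamer", indexed_terms))  # False
print(prefix_query_matches("tuin", indexed_terms))       # True
```

The prefix "tuinkamer" is longer than the stored term, so nothing starts with
it, while the shorter "tuin" still matches.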



      

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
Sorry to bother you again, but it still doesn't seem to work all the time...

This (what you solved earlier) works:
&q=title_search:Pappegay&defType=lucene&fl=id,title


But for another location, whose value in the DB is: "de tuinkamer"

When I query the id of that location:
&q=id:431&fl=id,title
the location is found, so it IS indexed...


But this query DOESN'T work:

&q=title_search:tuinkamer*&defType=lucene&fl=id,title

And this one DOES:
&q=title_search:tuin*&defType=lucene&fl=id,title

For me this is unexpected... what can it be?
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2137983.html

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
Correct! Thanks again, it now works! :)
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2137284.html

Re: full text search in multiple fields

Posted by Ahmet Arslan <io...@yahoo.com>.
> 
> When I do:
> &q=title_search:Pappegay*&defType=lucene&q=*:*&fl=id,title
> 
> nothing is found.
> 

This is expected since you have a lowercase filter in your index analyzer. Wildcard searches are not analyzed, so you need to lowercase your query on the client side: &q=title_search:pappegay*&defType=lucene&fl=id,title
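
A minimal client-side sketch in Python, assuming the field name and
parameters from the query above. The helper prefix_query_params is
hypothetical, and it does not escape query-syntax special characters in the
user input, which a real client would also need to handle.

```python
from urllib.parse import urlencode


def prefix_query_params(field, user_input):
    """Lowercase the user's text before appending the wildcard."""
    term = user_input.strip().lower()
    return {
        "q": f"{field}:{term}*",
        "defType": "lucene",
        "fl": "id,title",
    }


params = prefix_query_params("title_search", "Pappegay")
print(params["q"])
print(urlencode(params))
```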


      

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
Mmmm, this is strange:

When I do:
&q=title_search:Pappegay*&defType=lucene&q=*:*&fl=id,title

nothing is found.

but if I do:
&q=title_search:Pappegay&defType=lucene&q=*:*&fl=id,title

the location IS found.

I do need a wildcard though, since users may also search on parts of the
title (as described earlier in this thread). But it looks almost as if the
location is not found when the wildcard is at the end and the searched string
is no longer than the position of the wildcard (if that makes sense :)
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2133991.html

Re: full text search in multiple fields

Posted by Ahmet Arslan <io...@yahoo.com>.
> 
> The name of the location in the database is:
> "Museumrestaurant De Pappegay"

What was the wildcard query for this?




      

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
Ok, I was trying to hide the actual name of the location, because I don't want
it to get indexed by search engines AND it's a bit of a weird name :p

The name of the location in the database is: "Museumrestaurant De Pappegay"

Anyway, here it is, I executed the queries you gave me, and this is the
result:

DOC FOUND:
http://localhost:8983/solr/db/select/?indent=on&facet=true&sort=membervalue%20desc&sort=location_rating%20desc&q=title_search:%22pappegay%22&defType=lucene&fl=title,title_search
http://localhost:8983/solr/db/select/?indent=on&facet=true&sort=membervalue%20desc&sort=location_rating%20desc&q=title_search:%22Pappegay%22&defType=lucene&fl=title,title_search

http://localhost:8983/solr/db/select/?indent=on&facet=true&sort=membervalue%20desc&sort=location_rating%20desc&q=title:%22Pappegay%22&defType=lucene&fl=title,title_search

NO DOC FOUND:
http://localhost:8983/solr/db/select/?indent=on&facet=true&sort=membervalue%20desc&sort=location_rating%20desc&q=title:%22pappegay%22&defType=lucene&fl=title,title_search
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2133915.html

Re: full text search in multiple fields

Posted by Ahmet Arslan <io...@yahoo.com>.
> Certainly did!
> Why, are you saying this code is correct as-is?

Yes, the query &q=title_search:hort*&defType=lucene should return documents having "Hortus supremus" in their title field with the configuration you sent us.

It should exist somewhere in the result set, if not in the top 10.

Try a few things to make sure your document is indexed.

&q=title_search:"Hortus supremus"&defType=lucene&fl=title,title_search
&q=title:"Hortus supremus"&defType=lucene&fl=title,title_search

Are they returning that document? Or find that document's unique id and query it.


      

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
Certainly did!
Why, are you saying this code is correct as-is?
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2133022.html

Re: full text search in multiple fields

Posted by Jonathan Rochkind <ro...@jhu.edu>.
Did you reindex after you changed your analyzers?

On 12/22/2010 12:57 PM, PeterKerk wrote:
> Hi guys,
>
> There's one more thing to get this code to work as I need I just found
> out...
>
> Im now using:&q=title_search:hort*&defType=lucene
> as iorixxx suggested.
>
> it works good BUT, this query doesnt find results if the title in DB is
> "Hortus supremus"
>
> I tried adding some tokenizers and filters to solve this, what I think is a
> casing issue, but no luck...
>
> below is my code...what am I missing here?
>
> Thanks again!
>
>
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
>    <analyzer type="index">
> 	<tokenizer class="solr.WhitespaceTokenizerFactory"/>
> 	
> 	<!-- in this example, we will only use synonyms at query time
> 	<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
> ignoreCase="true" expand="false"/>
> 	-->
> 	<filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords_dutch.txt"/>
> 	<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> catenateAll="0" splitOnCaseChange="1"/>
> 	<filter class="solr.LowerCaseFilterFactory"/>
> 	<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
> 	<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>    </analyzer>
>    <analyzer type="query">
> 	<tokenizer class="solr.WhitespaceTokenizerFactory"/>
> 	<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
> ignoreCase="true" expand="true"/>
> 	<filter class="solr.StopFilterFactory" ignoreCase="true"
> words="stopwords_dutch.txt"/>
> 	<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="0" catenateNumbers="0"
> catenateAll="0" splitOnCaseChange="1"/>
> 	<filter class="solr.LowerCaseFilterFactory"/>
> 	<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
> 	<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>    </analyzer>
> </fieldType>
>
>
> <field name="title" type="text_ws" indexed="true" stored="true"/>
> <field name="title_search" type="text" indexed="true" stored="true"/>
> <copyField source="title" dest="title_search"/>

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
Hi guys,

There's one more thing needed to get this code to work the way I need, as I
just found out...

I'm now using: &q=title_search:hort*&defType=lucene
as iorixxx suggested.

It works well, BUT this query doesn't find results if the title in the DB is
"Hortus supremus".

I tried adding some tokenizers and filters to solve what I think is a
casing issue, but no luck...

Below is my code... what am I missing here?

Thanks again!


<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
	<tokenizer class="solr.WhitespaceTokenizerFactory"/>
	
	<!-- in this example, we will only use synonyms at query time
	<filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt"
ignoreCase="true" expand="false"/>
	-->
	<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords_dutch.txt"/>
	<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="1" catenateNumbers="1"
catenateAll="0" splitOnCaseChange="1"/>
	<filter class="solr.LowerCaseFilterFactory"/>
	<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
	<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
	<tokenizer class="solr.WhitespaceTokenizerFactory"/>
	<filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"/>
	<filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords_dutch.txt"/>
	<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="0" catenateNumbers="0"
catenateAll="0" splitOnCaseChange="1"/>
	<filter class="solr.LowerCaseFilterFactory"/>
	<filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
	<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>


<field name="title" type="text_ws" indexed="true" stored="true"/>
<field name="title_search" type="text" indexed="true" stored="true"/>
<copyField source="title" dest="title_search"/>
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2132659.html

Re: full text search in multiple fields

Posted by Ahmet Arslan <io...@yahoo.com>.
> There's a location with title: hortus rodondendrus
> 
> This location is found using this query:
> http://localhost:8983/solr/db/select/?indent=on&q=hortus&defType=dismax&qf=title_search^20.0
> But not when using this query:
> http://localhost:8983/solr/db/select/?indent=on&q=hort&defType=dismax&qf=title_search^20.0
> 
> So, I believe my title value is not indexed the way I'd
> like it to be
> indexed. I think currently Im indexing it in full words,
> but am not
> tokenizing it per character...if that makes sense :)

> What should I add for this to be indexed in such a way that
> word parts are
> also found?

The question is, do you want to retrieve that document, with the following queries too? h, ho, hor, hort, hortu.

Or is there a special relation between just hortus and hort?

For the former, you can use the * operator, e.g. &q=title_search:hort*&defType=lucene
Please note that * is not supported by dismax.

For the latter one you can use http://wiki.apache.org/solr/LanguageAnalysis#solr.StemmerOverrideFilterFactory to manually reduce hortus to hort.


      

Re: full text search in multiple fields

Posted by PeterKerk <ve...@hotmail.com>.
Ok, Im back ;)

There's one final thing that needs to be fixed..

I'm trying to apply the same logic as for cities, but now for the title of a
location.

There's a location with title: hortus rodondendrus

This location is found using this query:
http://localhost:8983/solr/db/select/?indent=on&q=hortus&defType=dismax&qf=title_search^20.0
But not when using this query:
http://localhost:8983/solr/db/select/?indent=on&q=hort&defType=dismax&qf=title_search^20.0

So, I believe my title value is not indexed the way I'd like it to be
indexed. I think I'm currently indexing it as full words, but am not
tokenizing it per character... if that makes sense :)

The fieldtype of title is "text", defined below:

    <fieldType name="text" class="solr.TextField"
positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
		
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory"
synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords_dutch.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="1"
catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory"
protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
ignoreCase="true" expand="true"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"
words="stopwords_dutch.txt"/>
        <filter class="solr.WordDelimiterFilterFactory"
generateWordParts="1" generateNumberParts="1" catenateWords="0"
catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.EnglishPorterFilterFactory"
protected="protwords.txt"/>
        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
      </analyzer>
    </fieldType>


What should I add for this to be indexed in such a way that word parts are
also found?

Thanks!
-- 
View this message in context: http://lucene.472066.n3.nabble.com/full-text-search-in-multiple-fields-tp1888328p2070528.html

Re: full text search in multiple fields

Posted by Tommaso Teofili <to...@gmail.com>.
Hi,

2010/11/12 PeterKerk <ve...@hotmail.com>

>
> I want to provide a full text search function.
>
> This function has to search through the 2 fields: "title" and "description"
> that I have defined in my schema.xml (both of type "string").
>
> Now, since solr doesnt (by default) provide an or operator,


I don't think this is correct: by default the Solr operator is OR, so the query
"foo bar" is actually "text:(foo OR bar)" (provided that text is your
default field).


> I thought I
> should somehow combine these fields into 1 field and THEN search that
> single
> field.
>

I think that DisMaxRequestHandler could be the correct choice; with it you
could write a query like the following:

BASEURL/dismax?q=foo%20bar&qf=title%20description

That way you'd get foo and bar searched (with the OR logical operator) inside
the title and description fields.
Hope this helps,
Tommaso

Re: full text search in multiple fields

Posted by Ahmet Arslan <io...@yahoo.com>.

--- On Fri, 11/12/10, PeterKerk <ve...@hotmail.com> wrote:

> From: PeterKerk <ve...@hotmail.com>
> Subject: full text search in multiple fields
> To: solr-user@lucene.apache.org
> Date: Friday, November 12, 2010, 1:32 PM
> 
> I want to provide a full text search function.
> 
> This function has to search through the 2 fields: "title"
> and "description"
> that I have defined in my schema.xml (both of type
> "string").
> 
> Now, since solr doesnt (by default) provide an or operator,
> I thought I
> should somehow combine these fields into 1 field and THEN
> search that single
> field.

Yes, you can do that with copyField. The field(s) should be solr.TextField. But http://wiki.apache.org/solr/DisMaxQParserPlugin may be more suitable in your case.