Posted to java-user@lucene.apache.org by er...@aftenposten.no on 2005/08/04 12:37:14 UTC

Search in multi fields with cross field AND ?

Hi there!

I'm fairly new to Lucene and am trying to evaluate it. I want to retrieve only the documents (companies) that contain 
all of the search words, where the words may be spread across different fields (an AND search across fields).

I have a structure in the "documents" (companies) like this :

companyName string
keywords1 string        (Extra boosting if match here)
keywords2 string

---

So far so good when I create the index with boosting on keywords1 etc.

If I search for:

"ford garage"

I want companies that repair Ford cars, and only those.

So if you have documents/companies like:

"company1", "ford volvo", "garage motor ...."
"company2", "volvo nissan", "garage ......."
"company3", null, "ford garage motor ....."

---

I should get a match for company1 and company3. Company2 doesn't work on Ford cars. Company3 has no match in keywords1, but 
it still does garage work on Ford cars, so it should come after company1 in the result list: a match in keywords1 
should give extra boost (a higher score), which is why I don't use a single field for both sets of keywords.

So I have tried different approaches to searching, something like this:

1 ---------------------------------------------------------------------------

        String[] fieldNames = new String[]{"companyName", "keywords1", "keywords2"};
        String searchText = "ford garage"; // also tried "ford AND garage"

        BooleanQuery query = new BooleanQuery();

        for(int i = 0; i < fieldNames.length; ++i) {
            QueryParser qp = new QueryParser(fieldNames[i], new StandardAnalyzer());

            qp.setOperator(QueryParser.DEFAULT_OPERATOR_AND);
            query.add(qp.parse(searchText), false, false);
        }

        Hits h = is.search(query);

2 ---------------------------------------------------------------------------

        Query query = MultiFieldQueryParser.parse(searchText, fieldNames, new StandardAnalyzer());

        Hits h = is.search(query);

------------------------------------------------------------------------------

In both cases I only get company3 in the result, I guess because "ford garage" occurs within a single field for that company. 
I don't get company1, I guess because the two words are split across two different fields. That is what I mean by a 
multi-field search with a cross-field AND, if that makes sense.
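
To make the difference concrete, here is a minimal sketch in plain Java (no Lucene involved) of the two matching rules applied to the three sample companies. It is only an illustration under simplifying assumptions: fields are split on whitespace, the trailing "...." in the sample data is dropped, and the class and method names (CrossFieldAndDemo, perFieldAnd, crossFieldAnd) are made up for this example. "Per-field AND" requires one field to hold every word, which mirrors the behaviour observed above; "cross-field AND" only requires each word to appear in some field, which is the behaviour being asked for.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class CrossFieldAndDemo {

    // Builds one sample company; a null value means the field is absent.
    static Map<String, String> doc(String companyName, String keywords1, String keywords2) {
        Map<String, String> d = new LinkedHashMap<>();
        d.put("companyName", companyName);
        d.put("keywords1", keywords1);
        d.put("keywords2", keywords2);
        return d;
    }

    // True if the whitespace-separated field value contains the term (case-insensitive).
    static boolean fieldHas(String value, String term) {
        return value != null
                && Arrays.asList(value.toLowerCase().split("\\s+")).contains(term.toLowerCase());
    }

    // Per-field AND: at least one single field must contain every term.
    static boolean perFieldAnd(Map<String, String> d, List<String> terms) {
        for (String value : d.values()) {
            boolean all = true;
            for (String term : terms) {
                all &= fieldHas(value, term);
            }
            if (all) return true;
        }
        return false;
    }

    // Cross-field AND: every term must occur in at least one field.
    static boolean crossFieldAnd(Map<String, String> d, List<String> terms) {
        for (String term : terms) {
            boolean found = false;
            for (String value : d.values()) {
                found |= fieldHas(value, term);
            }
            if (!found) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("ford", "garage");
        List<Map<String, String>> docs = Arrays.asList(
                doc("company1", "ford volvo", "garage motor"),
                doc("company2", "volvo nissan", "garage"),
                doc("company3", null, "ford garage motor"));
        for (Map<String, String> d : docs) {
            System.out.println(d.get("companyName")
                    + " perFieldAnd=" + perFieldAnd(d, terms)
                    + " crossFieldAnd=" + crossFieldAnd(d, terms));
        }
    }
}
```

With this rule, company1 fails the per-field check but passes the cross-field one, company2 fails both because "ford" appears nowhere, and company3 passes both.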

So is this possible with Lucene, and what have I missed? I have the Lucene book but can't find anything there. I also searched
the archive and didn't find anything there either, but maybe I didn't look hard enough. What should I look for?

Any clues on how I can do this search?


Thanks in advance!


Cheers, 
Erlend 

Re: Search in multi fields with cross field AND ?

Posted by Martin Rode <ma...@programmfabrik.de>.
Erlend,

try this code:

QueryParser q = new QueryParser("text", analyzer);
q.setOperator(QueryParser.DEFAULT_OPERATOR_AND);
           
Query query = q.parse(search);
Hits hits = isearcher.search(query);
 

Best,
Martin
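
The snippet above queries a single field named "text", which presumably assumes that all searchable text is indexed into one catch-all field. If the fields have to stay separate so that keywords1 can score higher, one possible alternative is to require each search word in at least one of the fields. The sketch below only builds such a query string in Lucene's query syntax, using plain Java so it runs without the library. The class name CrossFieldQueryBuilder, the method build, and the boost value 2.0 are hypothetical, and the sketch assumes plain words with no special query characters that would need escaping.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class CrossFieldQueryBuilder {

    // For each word, emits a group that matches the word in any field,
    // then joins the groups with AND so every word must match somewhere.
    static String build(String searchText, Map<String, Float> fieldBoosts) {
        StringBuilder sb = new StringBuilder();
        for (String term : searchText.trim().split("\\s+")) {
            if (term.isEmpty()) continue;            // empty input yields an empty query
            if (sb.length() > 0) sb.append(" AND ");
            sb.append('(');
            boolean first = true;
            for (Map.Entry<String, Float> e : fieldBoosts.entrySet()) {
                if (!first) sb.append(" OR ");
                first = false;
                sb.append(e.getKey()).append(':').append(term);
                if (e.getValue() != 1.0f) {
                    sb.append('^').append(e.getValue()); // query-time boost, e.g. ^2.0
                }
            }
            sb.append(')');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, Float> fields = new LinkedHashMap<>();
        fields.put("companyName", 1.0f);
        fields.put("keywords1", 2.0f);   // hypothetical boost for the preferred field
        fields.put("keywords2", 1.0f);
        System.out.println(build("ford garage", fields));
        // prints: (companyName:ford OR keywords1:ford^2.0 OR keywords2:ford) AND (companyName:garage OR keywords1:garage^2.0 OR keywords2:garage)
    }
}
```

The resulting string could then be handed to q.parse(...) as in the snippet above. Because every clause names its field explicitly, the parser's default field should not matter. If keywords1 is already boosted at index time, as described in the question, the query-time boost can be left at 1.0.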







