Posted to user@lucenenet.apache.org by Oliver Castle <Ol...@TescoDiets.com> on 2007/10/22 11:18:24 UTC
Multi field search problems
Hello
Firstly, great work to everyone involved in Lucene.NET; it's a great
project. We are going to use it as the search engine for a new project
that is due to go live in the next few months. We get great results when
searching a single field, but our users will generally be selecting from
two or three fields, and the MultiFieldQueryParser does not seem to
produce good results for that. Could anyone help us?
To create the index, which is written to disk, I am using the code below:
luceneDocument.Add(new Field(Constant.LuceneConstant_FoodItemId,
    foodItem.FoodItemId.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_ParentFoodItemId,
    foodItem.ParentFoodItemId.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_DataProviderItemId,
    foodItem.DataProviderItemId.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_EANcode,
    (foodItem.EuropeanArticleNumberCode == null)
        ? string.Empty : foodItem.EuropeanArticleNumberCode.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_Summary,
    foodItem.Summary,
    Field.Store.YES, Field.Index.TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_StorageTypeId,
    foodItem.StorageTypeId.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_FoodItemSupplierId,
    foodItem.FoodItemSupplierId.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_FoodItemBrandId,
    foodItem.FoodItemBrandId.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_MeasurementType,
    ((int)foodItem.MeasurementType).ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_NoOfUnits,
    foodItem.NoOfUnits.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_PackSize,
    foodItem.PackSize.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_PortionSize,
    foodItem.PortionSize.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_Approved,
    foodItem.Approved.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_Created,
    foodItem.Created.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
luceneDocument.Add(new Field(Constant.LuceneConstant_Updated,
    foodItem.Updated.ToString(),
    Field.Store.YES, Field.Index.UN_TOKENIZED, Field.TermVector.NO));
The index contains about 20,000 food products, which produces an index of
about 6 MB. The code to search this index is below:
public static SortableResultsCollection<FoodItem> FoodItemSearch(
    short departmentCode, short ailseCode, short shelfCode,
    short supermarketId, byte searchType, string searchQuery,
    int startValue, int amountOfResults)
{
// NOTE: despite the names, queryList collects the field names and
// queryFieldList collects the text to search for in each field.
List<string> queryFieldList = new List<string>();
List<string> queryList = new List<string>();
List<BooleanClause.Occur> queryClauseList = new List<BooleanClause.Occur>();
//Process the department code
if (departmentCode > 0)
{
queryList.Add(Constant.LuceneConstant_CategoryTopLevel);
queryFieldList.Add(departmentCode.ToString());
queryClauseList.Add(BooleanClause.Occur.MUST);
}
//Process the aisle code
if (ailseCode > 0)
{
queryList.Add(Constant.LuceneConstant_CategorySecondLevel);
queryFieldList.Add(ailseCode.ToString());
queryClauseList.Add(BooleanClause.Occur.MUST);
}
//Process the shelf code
if (shelfCode > 0)
{
queryList.Add(Constant.LuceneConstant_CategoryThirdLevel);
queryFieldList.Add(shelfCode.ToString());
queryClauseList.Add(BooleanClause.Occur.MUST);
}
//Process the supermarket
if (supermarketId > 0)
{
queryList.Add(Constant.LuceneConstant_FoodItemSupplierId);
queryFieldList.Add(supermarketId.ToString());
queryClauseList.Add(BooleanClause.Occur.MUST);
}
//Process the search query
if (!string.IsNullOrEmpty(searchQuery))
{
if (searchType == 1)
{
queryList.Add(Constant.LuceneConstant_Summary);
queryFieldList.Add(searchQuery);
queryClauseList.Add(BooleanClause.Occur.MUST);
}
else
{
queryList.Add(Constant.LuceneConstant_EANcode);
queryFieldList.Add(searchQuery);
queryClauseList.Add(BooleanClause.Occur.MUST);
}
}
//Create the arrays to pass to the query
string[] queryFieldArray = new string[queryFieldList.Count];
string[] queryArray = new string[queryList.Count];
BooleanClause.Occur[] occurArray = new
BooleanClause.Occur[queryList.Count];
//Assign the list data to the array
int rowCount = queryList.Count - 1;
for (int i = 0; i <= rowCount; i++)
{
queryFieldArray[i] = queryFieldList[i];
queryArray[i] = queryList[i];
occurArray[i] = queryClauseList[i];
}
Query query = MultiFieldQueryParser.Parse(queryFieldArray,
queryArray, occurArray, new StandardAnalyzer());
EGC.SortableResultsCollection<FoodItem> collection = new
EGC.SortableResultsCollection<FoodItem>();
string indexPath =
ConfigurationManager.AppSettings[CONST_LUCENE_PATH];
IndexSearcher indexSearcher = new IndexSearcher(indexPath);
Hits searchHits = indexSearcher.Search(query);
int totalHitsCount = searchHits.Length();
collection.TotalResultCount = totalHitsCount - 1;
collection.StartResult = startValue;
// Exclusive upper bound for this page, clamped to the number of hits
// (assumes startValue is a zero-based offset into the hits)
int hitsCount = startValue + amountOfResults;
if (hitsCount > totalHitsCount)
{
hitsCount = totalHitsCount;
}
for (int hitNumer = startValue; hitNumer < hitsCount; hitNumer++)
{
collection.Add(LuceneDocumentToFoodItem(searchHits.Doc(hitNumer)));
}
//close the index searcher
indexSearcher.Close();
return collection;
}
I know the code is not amazing at the moment, but we are trying to get
the concept right before we do any tuning. I have tried changing
BooleanClause.Occur.MUST to BooleanClause.Occur.SHOULD, but we have seen
no improvement in the search. Is using the MultiFieldQueryParser the
correct way to search for products in the index, or do I need to run
single-field searches and then apply filters to narrow the results?
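To make the two options concrete, here is a rough sketch of each, assuming the Lucene.NET 2.x API used in the code above. The class FoodItemQuerySketch and both method names are hypothetical and only for illustration. Combine shows roughly what the static MultiFieldQueryParser.Parse call amounts to (one parsed query per field, added as clauses of a BooleanQuery), and SearchWithFilter shows the filter alternative, where an exact-match id restricts the hits without contributing to the score.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;

// Hypothetical helper class, for illustration only.
public static class FoodItemQuerySketch
{
    // Roughly equivalent to
    // MultiFieldQueryParser.Parse(queries, fields, flags, analyzer):
    // queries[i] is parsed against fields[i] and added with flags[i].
    public static Query Combine(string[] queries, string[] fields,
        BooleanClause.Occur[] flags, Analyzer analyzer)
    {
        BooleanQuery combined = new BooleanQuery();
        for (int i = 0; i < fields.Length; i++)
        {
            QueryParser parser = new QueryParser(fields[i], analyzer);
            combined.Add(parser.Parse(queries[i]), flags[i]);
        }
        return combined;
    }

    // Filter alternative: the free-text query is scored as usual, while the
    // untokenized id field is matched exactly through a QueryFilter, so it
    // narrows the hits but does not influence their ranking.
    public static Hits SearchWithFilter(IndexSearcher searcher,
        string textField, string text, string idField, string id)
    {
        Query textQuery =
            new QueryParser(textField, new StandardAnalyzer()).Parse(text);
        Filter idFilter = new QueryFilter(new TermQuery(new Term(idField, id)));
        return searcher.Search(textQuery, idFilter);
    }
}
```

For example, SearchWithFilter(indexSearcher, Constant.LuceneConstant_Summary, searchQuery, Constant.LuceneConstant_FoodItemSupplierId, supermarketId.ToString()) would rank on the summary text only and restrict the hits to one supplier.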
Any help would be gratefully received.
Thanks
Ollie Castle