Posted to java-user@lucene.apache.org by "ZYWALEWSKI, DANIEL (DANIEL)" <da...@alcatel-lucent.com> on 2011/02/16 11:39:16 UTC

query with long names

Hello,
 I have a problem with documents that match the same query. I do not index anything that clearly identifies my documents (like an id). So when I want to add a document that is already indexed, I don't add it. And if I want to delete a document and more than one document matches my query, I don't delete any of them. The problem is that the only difference between the documents is a name. So it looks like this:
1) I want to index "Crazy Network"
2) I create a Lucene document with Field "name" and value "Crazy Network"
3) I use a QueryParser with StandardAnalyzer to check whether I have already indexed it:
    - I use a StringBuffer to add quotes before and after the name I'm looking for, so the query is "Crazy Network" in nameField.
 4) If there is no match I index it; otherwise I do not.

So if the first name I index is "Private Network Really", I cannot index "Private Network" afterwards (because "Private Network Really" will match the query, which for me means that the document is already indexed).
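
To make the false positive concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class PhraseMatchSketch and its methods are hypothetical, and tokenize() is only a rough stand-in for what an analyzer such as StandardAnalyzer does (lowercasing and splitting into words). It assumes a quoted phrase matches whenever its tokens appear consecutively in the field.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical model of phrase matching on an analyzed field (not the Lucene API).
class PhraseMatchSketch {

    // Rough approximation of analysis: lowercase, then split on non-alphanumerics.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String t : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // True when the phrase tokens occur consecutively inside the field tokens.
    static boolean phraseMatches(String fieldValue, String phrase) {
        List<String> field = tokenize(fieldValue);
        List<String> query = tokenize(phrase);
        if (query.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(field, query) >= 0;
    }

    public static void main(String[] args) {
        // The shorter name is found inside the longer one: prints true
        System.out.println(phraseMatches("Private Network Really", "Private Network"));
        // The longer name is not found inside the shorter one: prints false
        System.out.println(phraseMatches("Private Network", "Private Network Really"));
    }
}
```

Under this model the check is asymmetric: a shorter name is blocked by any longer indexed name that contains it, but not the other way round.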

Is there any way to format the query so that it clearly identifies the name I'm looking for? So that if I search for "Private Network" I won't also find "Private Network Really"?
Thanks
D

Re: query with long names

Posted by Erick Erickson <er...@gmail.com>.

Sure, just use a field that is not analyzed. Perhaps you want to
define a new field in your documents like "nameKey" that is
analyzed with something like KeywordAnalyzer. See:
http://lucene.apache.org/java/3_0_3/api/all/index.html

PerFieldAnalyzerWrapper will let you use different
analyzers for different fields.

Best
Erick
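
The effect of an unanalyzed key field can be sketched in plain Java. This is a model of the idea, not Lucene code: NameKeySketch and its methods are hypothetical, and the HashSet stands in for an index in which each whole name is stored as a single term. In Lucene itself the corresponding pieces would be a field indexed with Field.Index.NOT_ANALYZED (or analyzed by KeywordAnalyzer) and looked up with a TermQuery rather than a parsed phrase query.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of an unanalyzed "nameKey" field (not the Lucene API).
class NameKeySketch {

    // Each complete name is one term, so lookups compare whole values only.
    private final Set<String> nameKeys = new HashSet<String>();

    // Exact lookup: true only if this precise name was indexed.
    boolean isIndexed(String name) {
        return nameKeys.contains(name);
    }

    // Adds the name unless it is already present; returns true if it was added.
    boolean addIfAbsent(String name) {
        return nameKeys.add(name);
    }

    // Removes only the entry whose value equals the name exactly.
    boolean deleteExact(String name) {
        return nameKeys.remove(name);
    }

    public static void main(String[] args) {
        NameKeySketch index = new NameKeySketch();
        System.out.println(index.addIfAbsent("Private Network Really")); // prints true
        System.out.println(index.addIfAbsent("Private Network"));        // prints true
        System.out.println(index.addIfAbsent("Private Network"));        // prints false
    }
}
```

One consequence of this design is that the comparison is exact: "private network" and "Private Network" count as different names, as do values that differ only in whitespace, so any normalization has to happen before the key is indexed and before it is queried.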

