Posted to java-user@lucene.apache.org by Eshwaramoorthy Babu <ba...@gmail.com> on 2006/12/04 09:45:17 UTC
Multiple character wildcard search
Hi,
Can anyone please tell me how to specify a multiple-character wildcard
search in a "Term"?
Below are my requirements:
1) I want to search for all names that start with Z (Z*).
2) My program will receive a list of names in a Java collection (Vector,
ArrayList or Hashtable), and I want to search for all the names that are not
in the collection.
I tried the code below for the first requirement, but the search
returns 0 results:
Analyzer analyzer = new WhitespaceAnalyzer();
boolean createFlag = true;
.......
.......
.......
IndexSearcher searcher = new IndexSearcher(indexDir1);
Query query = new TermQuery(new Term("name", "Z*"));
When I execute the above code, the search returns 0 hits.
If I give the full name "Zane", the search returns 1 hit.
Thanks in advance,
Babu
Re: Dreaded optimize (again!)
Posted by Michael McCandless <lu...@mikemccandless.com>.
Stanislav Jordanov wrote:
> How much free disk space should be there (with respect to the index
> size) in order for the optimize to complete successfully?
Good question!
Really this detail should be included in the Javadoc for optimize (and
more generally addDocument, addIndexes(*), etc.). I will update the
Javadocs of these methods once we work out the answer here.
Optimize actually does a series of IndexWriter.mergeSegments (private
method) calls. Each call merges the last mergeFactor (default 10)
segments into a single segment, and repeats this until there's 1
segment. So this question reduces to peak temp disk usage of
mergeSegments.
That call builds up the new segment by reading all data from each of
the input segments and writing into a new single segment. Only once
the new segment is fully written to disk, does it then "commit",
meaning it writes a new "segments" or "segments_N" (trunk) file and
then removes the input segments. So max temp disk usage of this step
is:
(net size of un-merged segments) +
(net size of original segments to be merged) +
(net size of new segment files)
Then if CFS is enabled it makes a CFS file for this new segment. Max
temp usage of this step is:
(net size of un-merged segments) +
2 * (net size of new segment files)
The relative segment sizes are not simple to compute, but I think if
you have no deletes and no separate norms, then the new segment will
generally be a bit smaller than the sum of the input segments (but
this is likely document dependent).
So, with optimize() it's the final call to mergeSegments that will
peak the temp disk usage, because that is the largest merge. So, I
think the overall answer is: you will need a little more (say 10%)
than 2X your final index size of free space to run optimize.
And expressed instead in terms of size of the index before starting
optimize the answer is: you need 2X the size of the input index in
free disk space, to run optimize.
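To make the two formulas above concrete, here is a minimal arithmetic sketch in plain Java. It is not Lucene code: the class name, the method names and the sample sizes (1000 MB of input segments merging down to a 950 MB segment) are assumptions chosen only for illustration. For the final merge of optimize every segment is an input, so the un-merged size is 0.

```java
// Illustrative arithmetic only; names and sizes are hypothetical, not Lucene API.
class OptimizeDiskEstimate {

    /** Peak usage while the merged segment is written: un-merged + inputs + new segment. */
    static long peakDuringMerge(long unmerged, long inputs, long merged) {
        return unmerged + inputs + merged;
    }

    /** Peak usage while the compound (CFS) file is built: un-merged + 2 * new segment. */
    static long peakDuringCfs(long unmerged, long merged) {
        return unmerged + 2 * merged;
    }

    /** Overall peak of one mergeSegments step, taking the larger of the two phases. */
    static long peak(long unmerged, long inputs, long merged) {
        return Math.max(peakDuringMerge(unmerged, inputs, merged),
                        peakDuringCfs(unmerged, merged));
    }

    public static void main(String[] args) {
        long unmergedMb = 0;    // final merge: nothing is left un-merged
        long inputsMb = 1000;   // assumed total size of the input segments
        long mergedMb = 950;    // assumed size of the merged segment (a bit smaller)

        System.out.println("merge phase peak (MB): "
                + peakDuringMerge(unmergedMb, inputsMb, mergedMb));
        System.out.println("CFS phase peak (MB):   "
                + peakDuringCfs(unmergedMb, mergedMb));
        System.out.println("overall peak (MB):     "
                + peak(unmergedMb, inputsMb, mergedMb));
    }
}
```

With these hypothetical sizes the merge phase peaks at 1950 MB and the CFS phase at 1900 MB, which is in line with the rough 2X figure.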
Mike
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Dreaded optimize (again!)
Posted by Stanislav Jordanov <st...@sirma.bg>.
Guys,
there's another aspect of the index optimize operation, that confuses us
a lot - the free disk space it requires to complete successfully.
Initially we thought that an amount of free disk space equal to the
index size (prior to optimization) should suffice.
Then it became clear that having the index open for reading (i.e.
searching) while optimizing it prevents old segments from being deleted.
So let us assume that nobody else is touching the index except the
optimize method itself.
How much free disk space should be there (with respect to the index
size) in order for the optimize to complete successfully?
Stanislav
Re: Multiple character wildcard search
Posted by Bhavin Pandya <bh...@rediff.co.in>.
Don't use "*" in the term. PrefixQuery takes the bare prefix:
Query query = new PrefixQuery(new Term("name","z"));
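The difference between an exact term lookup and a prefix lookup can be modelled in plain Java. This is only a sketch, not Lucene's implementation: the sorted set stands in for the terms of the "name" field, the class and method names are hypothetical, and it assumes the names were indexed as whole tokens with their original case (WhitespaceAnalyzer splits on whitespace and does not lowercase).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Plain-Java model of term lookup; names are hypothetical, not Lucene API.
class PrefixLookupSketch {

    /** Exact lookup, in the spirit of a term query: the text must equal an indexed term. */
    static boolean exactMatch(SortedSet<String> terms, String text) {
        return terms.contains(text);
    }

    /** Prefix lookup: scan the sorted terms from the prefix until one no longer starts with it. */
    static List<String> prefixMatch(SortedSet<String> terms, String prefix) {
        List<String> hits = new ArrayList<>();
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can carry the prefix
            }
            hits.add(term);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Assumed sample terms, kept in their original case.
        SortedSet<String> terms =
                new TreeSet<>(Arrays.asList("Adam", "Zane", "Zoe", "zebra"));

        System.out.println("exact \"Z*\":   " + exactMatch(terms, "Z*"));
        System.out.println("exact \"Zane\": " + exactMatch(terms, "Zane"));
        System.out.println("prefix \"Z*\":  " + prefixMatch(terms, "Z*"));
        System.out.println("prefix \"Z\":   " + prefixMatch(terms, "Z"));
        System.out.println("prefix \"z\":   " + prefixMatch(terms, "z"));
    }
}
```

In this model the literal text "Z*" matches nothing, either as an exact term or as a prefix, while the bare prefix "Z" finds both names. Because case is preserved, a lowercase "z" prefix does not find "Zane", so the prefix has to match the case of the indexed terms.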
- Bhavin pandya
Re: Multiple character wildcard search
Posted by Eshwaramoorthy Babu <ba...@gmail.com>.
Hi,
Do I have to use a specific analyzer to use PrefixQuery?
I am using WhitespaceAnalyzer, and below is how I populate the fields
when adding to the writer.
Document contactDocument = new Document();
contactDocument.add(new Field("type",contact.getType(),Field.Store.NO,
Field.Index.TOKENIZED));
writer1.addDocument(contactDocument);
Thanks,
Babu
Re: Multiple character wildcard search
Posted by Eshwaramoorthy Babu <ba...@gmail.com>.
Hi Bhavin,
Thanks for your response. I tried the following:
Query query = new PrefixQuery(new Term("name", "Z*"));
but the query still returns 0 results.
Also, can you please tell me how to search from a Java collection?
Thanks,
Babu
Re: Multiple character wildcard search
Posted by Bhavin Pandya <bh...@rediff.co.in>.
Babu,
Use "PrefixQuery", and if you also need phrase matching, use
"PhrasePrefixQuery".
Check the API docs for usage.
- Bhavin pandya