Posted to java-user@lucene.apache.org by Eshwaramoorthy Babu <ba...@gmail.com> on 2006/12/04 09:45:17 UTC

Multiple character wildcard search

Hi,

Can anyone please tell me how to specify a multiple-character wildcard
search in a "Term"?

My requirements are:

1) I want to search for all names that start with Z (Z*).

2) My program will receive a list of names in a Java collection (Vector,
ArrayList, or Hashtable). I want to search for all the names that are not
in the collection.

I tried the code below for the first requirement, but the search
returns 0 results:

Analyzer analyzer = new WhitespaceAnalyzer();
boolean createFlag = true;
.......
.......
.......
IndexSearcher searcher = new IndexSearcher(indexDir1);
Query query = new TermQuery(new Term("name", "Z*"));

When I execute the above code, the search returns 0 hits.
If I give the full name "Zane", the search returns 1 hit.

Thanks in advance,

Babu
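
Requirement 2 is an exclusion search: every indexed name except those in the given collection. In Lucene this is usually expressed as a BooleanQuery holding a MatchAllDocsQuery as a MUST clause plus one MUST_NOT TermQuery per excluded name; BooleanQuery caps the number of clauses (1024 by default), so a very large collection may need a different approach. The sketch below does not call Lucene. It only models the intended result in plain Java so the semantics are unambiguous, and the names ExclusionSketch and namesNotIn are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ExclusionSketch {

    // Returns the indexed names that do not appear in the excluded collection.
    // Comparison is exact and case-sensitive, like a match on an unmodified term.
    public static List<String> namesNotIn(List<String> indexedNames,
                                          Collection<String> excluded) {
        Set<String> excludedSet = new HashSet<>(excluded); // fast membership checks
        List<String> result = new ArrayList<>();
        for (String name : indexedNames) {
            if (!excludedSet.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical data standing in for the index and the incoming collection.
        List<String> indexed = Arrays.asList("Zane", "Zoe", "Adam", "Babu");
        List<String> excluded = Arrays.asList("Zoe", "Babu");
        System.out.println(namesNotIn(indexed, excluded));
    }
}
```

Vector, ArrayList, and the key set of a Hashtable can all be passed as the excluded collection, since the method only needs a Collection of strings.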

Re: Dreaded optimize (again!)

Posted by Michael McCandless <lu...@mikemccandless.com>.
Stanislav Jordanov wrote:

> How much free disk space should be there (with respect to the index 
> size) in order for the optimize to complete successfully?

Good question!

Really this detail should be included in the Javadoc for optimize (and
more generally addDocument, addIndexes(*), etc.).  I will update the
Javadocs of these methods once we work out the answer here.

Optimize actually does a series of IndexWriter.mergeSegments (private
method) calls.  Each call merges the last mergeFactor (default 10)
segments into a single segment, and repeats this until there's 1
segment.  So this question reduces to peak temp disk usage of
mergeSegments.

That call builds up the new segment by reading all data from each of
the input segments and writing it into a single new segment.  Only once
the new segment is fully written to disk does it "commit",
meaning it writes a new "segments" or "segments_N" (trunk) file and
then removes the input segments.  So the max temp disk usage of this
step is:

     (net size of un-merged segments) +
        (net size of original segments to be merged) +
        (net size of new segment files)

Then if CFS is enabled it makes a CFS file for this new segment.  Max
temp usage of this step is:

     (net size of un-merged segments) +
       2 * (net size of new segment files)
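
As a worked example of the two formulas above, the sketch below plugs in concrete sizes. The numbers and the names MergeStepSketch, mergeStepPeak, and cfsStepPeak are hypothetical, sizes are in arbitrary units such as MB, and the results are the total disk occupied during each step according to the formulas, not a measurement of Lucene.

```java
public class MergeStepSketch {

    // Peak while the new segment is being written: the segments not taking part
    // in the merge, the input segments, and the new segment files all coexist.
    public static long mergeStepPeak(long unmerged, long inputs, long newSegment) {
        return unmerged + inputs + newSegment;
    }

    // Peak while the compound (CFS) file is being built: the inputs are gone,
    // but the new segment exists twice (plain files plus the CFS copy).
    public static long cfsStepPeak(long unmerged, long newSegment) {
        return unmerged + 2 * newSegment;
    }

    public static void main(String[] args) {
        long unmerged = 500;    // hypothetical: segments not part of this merge
        long inputs = 300;      // hypothetical: segments being merged
        long newSegment = 280;  // hypothetical: assumed slightly smaller than the inputs
        System.out.println("merge step peak: " + mergeStepPeak(unmerged, inputs, newSegment));
        System.out.println("CFS step peak:   " + cfsStepPeak(unmerged, newSegment));
    }
}
```

With these inputs the merge step peaks at 1080 and the CFS step at 1060, so the two steps land close together whenever the new segment is about as large as its inputs.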

The relative segment sizes are not simple to compute, but I think that
if you have no deletes and no separate norms, the new segment will
generally be a bit smaller than the sum of the input segments (though
this likely depends on the documents).

So, with optimize() it's the final call to mergeSegments that produces
the peak temp disk usage, because that is the largest merge.  So, I
think the overall answer is: you will need a little more (say 10%)
than 2X your final index size of free space to run optimize.

And expressed instead in terms of size of the index before starting
optimize the answer is: you need 2X the size of the input index in
free disk space, to run optimize.
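
The reasoning above can be sketched as a small simulation: merge the last mergeFactor segments repeatedly and record the largest total disk footprint seen. This is a model of the description in this message, not Lucene's code. It assumes the merged segment is exactly as large as its inputs (an upper bound if merged segments come out a bit smaller), and the names OptimizePeakSketch and peakDiskUsage are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OptimizePeakSketch {

    private static long sum(List<Long> sizes) {
        long total = 0;
        for (long s : sizes) total += s;
        return total;
    }

    // Repeatedly merges the last mergeFactor segments until one remains and
    // returns the largest total disk footprint observed along the way.
    public static long peakDiskUsage(List<Long> segmentSizes, int mergeFactor, boolean useCfs) {
        if (mergeFactor < 2) throw new IllegalArgumentException("mergeFactor must be >= 2");
        List<Long> segments = new ArrayList<>(segmentSizes);
        long peak = sum(segments);
        while (segments.size() > 1) {
            int from = Math.max(0, segments.size() - mergeFactor);
            List<Long> untouched = new ArrayList<>(segments.subList(0, from));
            long unmerged = sum(untouched);
            long inputs = sum(segments.subList(from, segments.size()));
            long merged = inputs; // assumption: no shrinkage during the merge

            // Step 1: inputs and the new segment coexist until the commit.
            peak = Math.max(peak, unmerged + inputs + merged);
            // Step 2 (only with CFS): new segment files plus their CFS copy.
            if (useCfs) peak = Math.max(peak, unmerged + 2 * merged);

            untouched.add(merged);
            segments = untouched;
        }
        return peak;
    }

    public static void main(String[] args) {
        List<Long> sizes = Arrays.asList(40L, 30L, 20L, 10L); // hypothetical sizes
        System.out.println(peakDiskUsage(sizes, 10, true));
    }
}
```

For this 100-unit index the simulated peak is 200 units, reached in the final merge, whether mergeFactor is 10 or 2.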

Mike

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Dreaded optimize (again!)

Posted by Stanislav Jordanov <st...@sirma.bg>.
Guys,

There's another aspect of the index optimize operation that confuses us
a lot: the free disk space it requires to complete successfully.
Initially we thought that an amount of free disk space equal to the
index size (prior to optimization) should suffice.
Then it became clear that having the index open for reading (i.e.
searching) while optimizing it prevents old segments from being deleted.
So let us assume that nobody else is touching the index except the
optimize method itself.
How much free disk space should there be (with respect to the index
size) for the optimize to complete successfully?

Stanislav



Re: Multiple character wildcard search

Posted by Bhavin Pandya <bh...@rediff.co.in>.
Don't use "*" in the term:

Query query = new PrefixQuery(new Term("name","z"));

- Bhavin pandya
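
To see why a TermQuery on "Z*" finds nothing while a PrefixQuery on a bare prefix works, it helps to picture the field's term dictionary as a sorted set of strings. The sketch below is a plain-Java model, not Lucene code: exactMatch stands in for TermQuery, prefixMatch for PrefixQuery, and both names are hypothetical. It also assumes the terms were indexed with their original case, since WhitespaceAnalyzer does not lowercase, so the prefix has to match the case of the indexed terms.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

public class PrefixLookupSketch {

    // Stand-in for TermQuery: the text must equal an indexed term exactly.
    public static boolean exactMatch(SortedSet<String> terms, String text) {
        return terms.contains(text);
    }

    // Stand-in for PrefixQuery: collect every indexed term starting with the prefix.
    public static List<String> prefixMatch(SortedSet<String> terms, String prefix) {
        List<String> hits = new ArrayList<>();
        // tailSet jumps to the first term >= prefix; matches are contiguous from there.
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) break;
            hits.add(term);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical term dictionary for the "name" field.
        SortedSet<String> terms = new TreeSet<>(Arrays.asList("Adam", "Zane", "Zoe", "zack"));
        System.out.println(exactMatch(terms, "Z*"));   // false: "Z*" is not an indexed term
        System.out.println(exactMatch(terms, "Zane")); // true
        System.out.println(prefixMatch(terms, "Z"));   // [Zane, Zoe]
        System.out.println(prefixMatch(terms, "Z*"));  // []: the '*' is taken literally
        System.out.println(prefixMatch(terms, "z"));   // [zack]: matching is case-sensitive
    }
}
```

If the names went through WhitespaceAnalyzer as "Zane", the prefix would need to be "Z" rather than "z"; with a lowercasing analyzer it would be the other way round.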


Re: Multiple character wildcard search

Posted by Eshwaramoorthy Babu <ba...@gmail.com>.
Hi,

Do I have to use a specific analyzer with PrefixQuery?
I am using WhitespaceAnalyzer, and below is how I populate the fields
when adding to the writer:

Document contactDocument = new Document();
contactDocument.add(new Field("type", contact.getType(),
    Field.Store.NO, Field.Index.TOKENIZED));
writer1.addDocument(contactDocument);

Thanks,
Babu
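
A PrefixQuery built in code is not passed through an analyzer; it is compared directly against the terms written at index time. Two things in the snippet above are therefore worth checking: the document is populated under the field "type" while the earlier query targets "name", and WhitespaceAnalyzer keeps the original case of each token. The sketch below is a plain-Java model of that behaviour, not Lucene code, and the class FieldTermsSketch and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

public class FieldTermsSketch {

    // field name -> sorted terms indexed under that field
    private final Map<String, TreeSet<String>> index = new HashMap<>();

    // Models a whitespace analyzer: split on whitespace, keep case as-is.
    public void addField(String field, String value) {
        TreeSet<String> terms = index.computeIfAbsent(field, f -> new TreeSet<>());
        for (String token : value.trim().split("\\s+")) {
            if (!token.isEmpty()) terms.add(token);
        }
    }

    // Models a prefix lookup restricted to one field.
    public List<String> prefixSearch(String field, String prefix) {
        List<String> hits = new ArrayList<>();
        TreeSet<String> terms = index.get(field);
        if (terms == null) return hits; // nothing was ever indexed under this field
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) break;
            hits.add(term);
        }
        return hits;
    }

    public static void main(String[] args) {
        FieldTermsSketch sketch = new FieldTermsSketch();
        sketch.addField("type", "Zane Smith"); // hypothetical value
        System.out.println(sketch.prefixSearch("name", "Z")); // wrong field
        System.out.println(sketch.prefixSearch("type", "Z"));
        System.out.println(sketch.prefixSearch("type", "z")); // wrong case
    }
}
```

Only the second lookup finds "Zane": the first asks a field that holds no terms, and the third uses a lowercase prefix against terms stored with a capital letter.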



Re: Multiple character wildcard search

Posted by Eshwaramoorthy Babu <ba...@gmail.com>.
Hi Bhavin,

Thanks for your response. I tried the following:

Query query = new PrefixQuery(new Term("name", "Z*"));

but the query still returns 0 results.

Also, can you please tell me how to search from a Java collection?

Thanks,
Babu



Re: Multiple character wildcard search

Posted by Bhavin Pandya <bh...@rediff.co.in>.
Babu,

Use "PrefixQuery", and if you also need phrase matching, use
"PhrasePrefixQuery".
Check the API docs for usage.

- Bhavin pandya

