Posted to user@lucenenet.apache.org by Laxmilal Menaria <lm...@chambal.com> on 2007/05/30 09:00:36 UTC

Get all unique values of specific field

Hello everyone,

I have created a Lucene index of a students database. The database has 5
fields: Name, Address, Class, PhoneNo and ScholarNo.
I opened a Searcher and ran the query "Name:Menaria", which returns 100
results. Now I want all the unique "Class" names that occur in the returned
Hits object. How can I get the unique Class list without using a loop?

Please suggest a way.

-- 
Thanks,
Laxmilal menaria

http://www.minalyzer.com/
http://www.chambal.com/

Re: Get all unique values of specific field

Posted by Jokin Cuadrado <jo...@gmail.com>.
You can also get the StringIndex of the Class field, which gives access to
the field values without having to load each document in the hits.

You can then implement a custom collector and gather the unique values in it.
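
As a rough illustration of the idea above, here is a plain-Java model with no
Lucene classes. It assumes two arrays shaped like a StringIndex: order[doc]
gives the ordinal of the document's Class term and lookup[ordinal] gives the
term text. The class name DistinctClassCollector and its methods are made up
for this sketch; in a real index the arrays would come from Lucene and
collect() would be driven by the search, so check the API of the version in
use.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

/**
 * Plain-Java model of the StringIndex idea; no Lucene classes are used.
 * order[doc] holds the ordinal of the document's Class term and
 * lookup[ordinal] holds the term text (illustrative names).
 */
final class DistinctClassCollector {
    private final int[] order;
    private final String[] lookup;
    private final BitSet seen;

    DistinctClassCollector(int[] order, String[] lookup) {
        this.order = order;
        this.lookup = lookup;
        this.seen = new BitSet(lookup.length);
    }

    /** Called once per matching document id, like a hit collector callback. */
    void collect(int doc) {
        seen.set(order[doc]);   // one array read, no stored-field access
    }

    /** Distinct values seen so far, in ordinal order. */
    List<String> distinctValues() {
        List<String> out = new ArrayList<>();
        for (int ord = seen.nextSetBit(0); ord >= 0; ord = seen.nextSetBit(ord + 1)) {
            out.add(lookup[ord]);
        }
        return out;
    }
}
```

There is still a loop over the matching documents, but each step is a single
array read rather than a document load, which is the point of the suggestion.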


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Get all unique values of specific field

Posted by Laxmilal Menaria <lm...@chambal.com>.
On 5/30/07, Laxmilal Menaria <lm...@chambal.com> wrote:
>
> Thanks karl,
>
> But if I implement faceted classification, then I know whats our classes
> name, but if I don't know classes name, then what should I do ?
>
>
> On 5/30/07, karl wettin <ka...@gmail.com> wrote:
> >
> >
> > 30 maj 2007 kl. 10.51 skrev Laxmilal Menaria:
> > >> > What's the problem with a hit list iteration ( it should be
> > >> > very fast  )  ?
> > >
> > > Thats okay for short index, But if index have millions of records
> > > or GB's
> > > data then it will get slow .
> >
> > Iterate only the top n results when you gather the unique values. If
> > you get a million hits, ask the user to narrow down the search a bit.
> >
> > Searching the forum archives for facets or faceted classification
> > might also be helpful.
> >
> >
> >
> > --
> > karl
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: java-user-help@lucene.apache.org
> >
> >
>
>
> --
> Thanks,
> Laxmilal menaria
>
> http://www.minalyzer.com/
> http://www.chambal.com/
>



-- 
Thanks,
Laxmilal menaria

http://www.minalyzer.com/
http://www.chambal.com/

Re: Get all unique values of specific field

Posted by Erick Erickson <er...@gmail.com>.
You could also think about pre-building a set of Filters once your index has
been built, perhaps at warm-up time: one Filter for each class, which you can
then use in your queries.

Erick
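
A minimal sketch of the per-class filter idea, in plain Java with
java.util.BitSet standing in for whatever bit set a Lucene Filter would
produce. The names ClassFilters, add and countsFor are hypothetical, and the
sketch assumes the class names are known when the filters are built.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

/** Plain-Java model of "one pre-built filter per class"; no Lucene classes are used. */
final class ClassFilters {
    // class name -> document ids in that class, built once (for example at warm-up)
    private final Map<String, BitSet> filters = new LinkedHashMap<>();

    void add(String className, int... docIds) {
        BitSet bits = filters.computeIfAbsent(className, k -> new BitSet());
        for (int id : docIds) {
            bits.set(id);
        }
    }

    /** For each class sharing at least one document with queryHits, the number of shared documents. */
    Map<String, Integer> countsFor(BitSet queryHits) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Map.Entry<String, BitSet> e : filters.entrySet()) {
            BitSet both = (BitSet) e.getValue().clone();  // leave the cached filter untouched
            both.and(queryHits);
            int n = both.cardinality();
            if (n > 0) {
                counts.put(e.getKey(), n);
            }
        }
        return counts;
    }
}
```

The keys of the returned map are the distinct classes among the hits. The
cost grows with the number of classes rather than the number of hits.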


Re: Get all unique values of specific field

Posted by karl wettin <ka...@gmail.com>.
30 maj 2007 kl. 12.21 skrev Laxmilal Menaria:

> On 5/30/07, karl wettin <ka...@gmail.com> wrote:
>> 30 maj 2007 kl. 10.51 skrev Laxmilal Menaria:
>> >> > What's the problem with a hit list iteration ( it should be
>> >> > very fast  )  ?
>> >
>> > Thats okay for short index, But if index have millions of records
>> > or GB's data then it will get slow .
>>
>> Iterate only the top n results when you gather the unique values. If
>> you get a million hits, ask the user to narrow down the search a bit.
>>
>> Searching the forum archives for facets or faceted classification
>> might also be helpful.
>
> But if I implement faceted classification, then I know whats our  
> classes
> name, but if I don't know classes name, then what should I do ?

Then I recommend my first suggestion: gather the information by iterating
the top n results.

But you do know the classes. There is a finite number of them, and at some
point in time you inserted them into your index. Depending on how you store
and index documents, you might even be able to extract them from the index.

For instance, you could index the class name untokenized in a field used
only for that purpose. The terms in this field then represent the
available classes. You access them by seeking and iterating terms using
the IndexReader.
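
To make "seeking and iterating terms" concrete, here is a small plain-Java
model of a term dictionary kept sorted by field and then by text. It uses no
Lucene classes; TermDictionary, Term and valuesOf are names invented for this
sketch, and the sort order is an assumption of the model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Plain-Java model of a term dictionary sorted by field, then text; no Lucene classes are used. */
final class TermDictionary {
    /** A (field, text) pair, ordered by field first and text second. */
    static final class Term implements Comparable<Term> {
        final String field;
        final String text;

        Term(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public int compareTo(Term o) {
            int c = field.compareTo(o.field);
            return c != 0 ? c : text.compareTo(o.text);
        }
    }

    private final TreeSet<Term> terms = new TreeSet<>();

    void add(String field, String text) {
        terms.add(new Term(field, text));   // duplicates collapse into one term
    }

    /** Seeks to the first term of the field, then iterates until the field changes. */
    List<String> valuesOf(String field) {
        List<String> out = new ArrayList<>();
        for (Term t : terms.tailSet(new Term(field, ""))) {
            if (!t.field.equals(field)) {
                break;   // left the field, so stop early
            }
            out.add(t.text);
        }
        return out;
    }
}
```

Because each distinct value is stored once as a term, the result is already
unique. Note that this lists every class in the index, not only the classes
of documents matching a particular query.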


-- 
karl




Re: Get all unique values of specific field

Posted by Laxmilal Menaria <lm...@chambal.com>.
Thanks Karl,

Faceted classification would work if I knew the class names in advance,
but what should I do if I don't know them?




-- 
Thanks,
Laxmilal menaria

http://www.minalyzer.com/
http://www.chambal.com/

Re: Get all unique values of specific field

Posted by karl wettin <ka...@gmail.com>.
30 maj 2007 kl. 10.51 skrev Laxmilal Menaria:
>> > What's the problem with a hit list iteration ( it should be
>> > very fast  )  ?
>
> Thats okay for short index, But if index have millions of records  
> or GB's
> data then it will get slow .

Iterate only the top n results when you gather the unique values. If  
you get a million hits, ask the user to narrow down the search a bit.

Searching the forum archives for facets or faceted classification  
might also be helpful.
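
A minimal sketch of the top-n approach in plain Java, with no Lucene classes.
It assumes the Class value of each hit is already available as a list in
ranked order (with Lucene it would have to be read from each hit); the names
TopNDistinct and distinctInTopN are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Plain-Java sketch of "look at the top n hits only"; no Lucene classes are used. */
final class TopNDistinct {
    /**
     * classOfHit holds the Class value of each hit in ranked order
     * (hypothetical input for this sketch).
     */
    static Set<String> distinctInTopN(List<String> classOfHit, int n) {
        Set<String> distinct = new LinkedHashSet<>();   // keeps first-seen order
        int limit = Math.min(n, classOfHit.size());
        for (int i = 0; i < limit; i++) {
            distinct.add(classOfHit.get(i));
        }
        return distinct;
    }
}
```

Capping the loop at n bounds the work per query, at the price of missing
classes that only occur further down the hit list.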



-- 
karl





RE: Get all unique values of specific field

Posted by Digy <di...@gmail.com>.
Hi Laxmilal,

I am sorry, but I don't understand what you are trying to do.

> I have created a Lucene Index of Students Database, this database have
> 5 fields i.e. Name, Address, Class, PhoneNo and ScholarNo.
> Now I have opened Searcher and query "Name:Menaria" , Its return 100
> results. Now I wants the All unique "Class"  names which is return in
> Hits objects, How can I get unique Class list without using Loop

This is a typical DB application, where you would query something like
"select distinct class from students where name = 'Menaria'".
You could use, for example, an embedded database of your choice for this
purpose.

"Indexing files", however, is about full-text search, and an index will
probably not provide the functionality you want unless you write some code.
In addition, some databases claim to have text search capabilities, but I
don't think they will be as good as Lucene when it comes to full-text search.





-----Original Message-----
From: Laxmilal Menaria [mailto:lmenaria@chambal.com] 
Sent: Thursday, May 31, 2007 8:02 AM
To: lucene-net-user@incubator.apache.org
Subject: Re: Get all unique values of specific field

Thanks Digy,

but if I index files i.e. xls, logs instead of database then how to find
distinct from Index ?


On 5/30/07, Digy <di...@gmail.com> wrote:
>
> Hi Laxmilal
>
> Why don't you use a database for your application. It seems like a DB
> application.
>
> DIGY
>
> -----Original Message-----
> From: Laxmilal Menaria [mailto:lmenaria@chambal.com]
> Sent: Wednesday, May 30, 2007 11:52 AM
> To: lucene-net-user@incubator.apache.org; java-user@lucene.apache.org
> Subject: Re: Get all unique values of specific field
>
> Thanks,
>
> Thats okay for short index, But if index have millions of records or GB's
> data then it will get slow . So what is better ?
>
> On 5/30/07, Erich Eichinger <E....@diamonddogs.cc> wrote:
> >
> >
> > hi,
> >
> > I guess it doesn't work without any iteration.
> >
> > Another option coming to my mind: If the list of possible classnames
> isn't
> > too long, you could do (pseudocode)
> >
> >   Hashtable resultCount = new Hashtable();
> >   foreach( string classname in possibleClassNames )
> >   {
> >     resultlist = index.SearchFor("Name:Menaria AND Class:"+classname)
> >     resultCount[classname] = resultlist.Count;
> >   }
> >
> > cheers,
> > Erich
> >
> >
> > > -----Original Message-----
> > > From: Michael Mitiaguin [mailto:mitiaguin@gmail.com]
> > > Sent: Wednesday, May 30, 2007 9:33 AM
> > > To: lucene-net-user@incubator.apache.org
> > > Subject: Re: Get all unique values of specific field
> > >
> > > Laxmilal ,
> > >
> > > What's the problem with a hit list iteration ( it should be
> > > very fast  )  ?
> > > I am not sure about equivalent    of SQL "distinct"  in Lucene.
> > > You didn't describe whether you index ( and plus store )  all fields.
> > > Effectively you may just store a primary key ( let's say
> > > incremental int id  for the sake of example , though you
> > > table doesn't look normalised , it  doesn't really matter  in
> > > our case  )  which  will be stored not  indexed ,  the rest
> > > may be indexed as (  Name + Address + Class  ...) or as
> > > separate fields if you need to and either stored or not stored.
> > > Having applied search "Name:Menaria"  ( and having written
> > > all this I realised that you still need to iterate a hit list
> > >  but let me finish
> > > :) )  you may form a string with coma delimited IDs and then
> > >
> > > "select distinct class from mytable where where id in "(" +
> > > formedstring + ") "
> > >
> > > Surely if you store everything in Lucene index , there is no
> > > need to query database and you may use collections for
> > > picking up distinct "Class" values , but  my understanding
> > > you still need to iterate through Hits
> > >
> > > Regards
> > > Michael
> > >
> > > On 5/30/07, Laxmilal Menaria <lm...@chambal.com> wrote:
> > > > Hello everyone,
> > > >
> > > > I have created a Lucene Index of Students Database, this
> > > database have
> > > > 5 fields i.e. Name, Address, Class, PhoneNo and ScholarNo.
> > > > Now I have opened Searcher and query "Name:Menaria" , Its
> > > return 100
> > > > results. Now I wants the All unique "Class"  names which is
> > > return in
> > > > Hits objects, How can I get unique Class list without using Loop.
> > > >
> > > > Please suggest me..
> > > >
> > > > --
> > > > Thanks,
> > > > Laxmilal menaria
> > > >
> > > > http://www.minalyzer.com/
> > > > http://www.chambal.com/
> > > >
> > >
> >
>
>
>
> --
> Thanks,
> Laxmilal menaria
>
> http://www.minalyzer.com/
> http://www.chambal.com/
>
>


-- 
Thanks,
Laxmilal menaria

http://www.minalyzer.com/
http://www.chambal.com/


Re: Get all unique values of specific field

Posted by Laxmilal Menaria <lm...@chambal.com>.
Thanks Digy,

But if I index files (e.g. xls, logs) instead of a database, how do I find
the distinct values from the index?




-- 
Thanks,
Laxmilal menaria

http://www.minalyzer.com/
http://www.chambal.com/

RE: Get all unique values of specific field

Posted by Digy <di...@gmail.com>.
Hi Laxmilal,

Why don't you use a database for your application? It seems like a DB
application.

DIGY



Re: Get all unique values of specific field

Posted by Laxmilal Menaria <lm...@chambal.com>.
Thanks,

That's okay for a small index, but if the index has millions of records or
GBs of data, it will get slow. So what is better?




-- 
Thanks,
Laxmilal menaria

http://www.minalyzer.com/
http://www.chambal.com/

RE: Get all unique values of specific field

Posted by Erich Eichinger <E....@diamonddogs.cc>.
hi,

I guess it doesn't work without any iteration.

Another option that comes to mind: if the list of possible class names isn't too long, you could do (pseudocode)

  // "index.SearchFor" is a placeholder for running a query, not a real API.
  Hashtable resultCount = new Hashtable();
  foreach( string classname in possibleClassNames )
  {
    // one query per candidate class; a non-zero count means the class occurs in the hits
    var resultlist = index.SearchFor("Name:Menaria AND Class:" + classname);
    resultCount[classname] = resultlist.Count;
  }

cheers,
Erich



Re: Get all unique values of specific field

Posted by Michael Mitiaguin <mi...@gmail.com>.
Laxmilal,

What's the problem with iterating the hit list? It should be very fast.
I am not sure there is an equivalent of SQL "distinct" in Lucene.
You didn't say whether you index (and also store) all fields.
In principle you could store just a primary key (say an incremental int id,
for the sake of example; your table doesn't look normalised, but that
doesn't really matter here), stored but not indexed. The rest may be
indexed as one combined field (Name + Address + Class ...) or as separate
fields if you need them, and either stored or not stored.
After running the search "Name:Menaria" (having written all this, I
realised that you still need to iterate the hit list, but let me finish :) )
you can build a string of comma-delimited IDs and then run

"select distinct class from mytable where id in (" + formedstring + ")"

Of course, if you store everything in the Lucene index, there is no need to
query the database, and you can use collections to pick out the distinct
"Class" values, but as I understand it you still need to iterate through
the Hits.

Regards
Michael


Re: Get all unique values of specific field

Posted by Mohammad Norouzi <mn...@gmail.com>.
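I'm not sure whether this fulfils your needs:
IndexReader.terms() returns a TermEnum:

            // Walk every term in the index and print those of the "Class" field.
            TermEnum te = reader.terms();
            try {
                while (te.next()) {
                    if (te.term().field().equals("Class")) {
                        System.out.println(te.term().field() + ":" + te.term().text());
                    }
                }
            } finally {
                te.close();  // release the enumeration when done
            }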





-- 
Regards,
Mohammad
--------------------------
see my blog: http://brainable.blogspot.com/