Posted to user@lucenenet.apache.org by André Maldonado <an...@gmail.com> on 2009/10/30 17:57:07 UTC

Simple question

Hi.

This may be a simple question, but I can't figure out the solution.

I need to search my index with something like "SELECT TOP 5 ... ORDER BY
another_field". But the query itself is empty, because I want to search across
all documents.

How can I do it?

Thanks

Re: Simple question

Posted by André Maldonado <an...@gmail.com>.
LOL... Thanks, Franklin, I think this will be helpful.

Thanks again.

"Then those who were in the boat came and worshiped him, saying: Truly you
are the Son of God." (Matthew 14:33)


On Fri, Oct 30, 2009 at 16:06, Franklin Simmons <fsimmons@sccmediaserver.com> wrote:

> I did it again, think I'll hang it up for the day.  The correct query class
> name is 'MatchAllDocsQuery'.
>
> -----Original Message-----
> From: Franklin Simmons [mailto:fsimmons@sccmediaserver.com]
> Sent: Friday, October 30, 2009 2:06 PM
> To: lucene-net-user@incubator.apache.org
> Subject: RE: Simple question
>
> Oops, I'm not being very helpful.
>
> Use the MatchAllDocumentsQuery class:
>
> Searcher searcher = new IndexSearcher(directory);
>
> Sort sort = new Sort(new SortField("another_field", SortField.AUTO, false));
>
> Hits hits = searcher.Search(new MatchAllDocumentsQuery(), sort);
>
>
> However, that may be a lot of processing.  You may want to tune the query
> in a way that minimizes overhead; someone else on the list may suggest a better
> strategy.
>
>
> -----Original Message-----
> From: Franklin Simmons [mailto:fsimmons@sccmediaserver.com]
> Sent: Friday, October 30, 2009 2:01 PM
> To: lucene-net-user@incubator.apache.org
> Subject: RE: Simple question
>
> Hi André,
>
> In this case you simply sort on the field. This may suffice:
>
> Searcher searcher = new IndexSearcher(directory);
>
> Sort sort = new Sort(new SortField("another_field", SortField.AUTO, false));
>
> Hits hits = searcher.Search(query, sort);
>
>
> You can limit the number of hits (e.g. to 5), but I won't get into that
> here.
>
>
> Beyond SortField.AUTO, take a look at the SortField class to see specific
> field types - the most interesting being SortField.CUSTOM.
>
>
> -----Original Message-----
> From: André Maldonado [mailto:andre.maldonado@gmail.com]
> Sent: Friday, October 30, 2009 1:46 PM
> To: lucene-net-user@incubator.apache.org
> Subject: Re: Simple question
>
> Hi Franklin.
>
> Which query do I use for this search (the variable "query")? I don't want any
> query; I just want the TOP 5 documents ordered by a field.
>
> Thanks
>
>
> On Fri, Oct 30, 2009 at 15:03, Franklin Simmons <fsimmons@sccmediaserver.com> wrote:
>
> > You can sort a search by multiple fields.  I think you could try something
> > like this:
> >
> > Searcher searcher = new IndexSearcher(directory);
> > Sort sort = new Sort(new SortField[] { SortField.FIELD_SCORE,
> >     new SortField("another_field") });
> > Hits hits = searcher.Search(query, sort);
> >
> >
> > -----Original Message-----
> > From: André Maldonado [mailto:andre.maldonado@gmail.com]
> > Sent: Friday, October 30, 2009 12:57 PM
> > To: lucene-net-user@incubator.apache.org
> > Subject: Simple question
> >
> > Hi.
> >
> > This may be a simple question, but I can't figure out the solution.
> >
> > I need to search my index with something like "SELECT TOP 5 ... ORDER BY
> > another_field". But the query itself is empty, because I want to search
> > across all documents.
> >
> > How can I do it?
> >
> > Thanks
> >
>

RE: Simple question

Posted by Franklin Simmons <fs...@sccmediaserver.com>.
André,

See the thread "How to loop through all the entries for a field" in the October 2009 list archive which illustrates the method IndexReader.Terms. This is the optimum choice since your case is quite specific.
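
For readers without the archive at hand, here is a minimal sketch of the TermEnum idea, assuming the Lucene.Net 2.x method-style API (Term(), Next(), Doc(); later releases expose some of these as properties). The class name TopByField and the method FirstDocs are hypothetical names chosen for illustration. It walks the terms of one field in index order and collects document ids until the limit is reached, so nothing has to be sorted at search time.

```csharp
using System.Collections.Generic;
using Lucene.Net.Index;

// Hypothetical helper, not part of Lucene.Net.
public static class TopByField
{
    // Returns up to `limit` document ids, following the term order of `field`.
    public static List<int> FirstDocs(IndexReader reader, string field, int limit)
    {
        List<int> docIds = new List<int>();

        // Terms(Term) positions the enumerator on the first term >= (field, "").
        TermEnum terms = reader.Terms(new Term(field, ""));
        try
        {
            do
            {
                Term term = terms.Term();

                // Stop when the enumerator is exhausted or has left the field.
                if (term == null || term.Field() != field)
                    break;

                // Collect the documents that contain this term.
                TermDocs docs = reader.TermDocs(term);
                try
                {
                    while (docIds.Count < limit && docs.Next())
                        docIds.Add(docs.Doc());
                }
                finally
                {
                    docs.Close();
                }
            }
            while (docIds.Count < limit && terms.Next());
        }
        finally
        {
            terms.Close();
        }

        return docIds;
    }
}
```

Two caveats apply to this sketch. Terms are enumerated in string order, so a numeric field only comes out in numeric order if its values were indexed zero-padded to a fixed width. It also assumes the field holds a single untokenized term per document; otherwise the same document can be returned more than once.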

The thread "Alternative to looping through Hits", also in the October 2009 list archive, is germane to sorting in the general case.

In my humble opinion the best way to learn is to get into the source, browse over the class documentation, and implement simple use tests.


-----Original Message-----
From: André Maldonado [mailto:andre.maldonado@gmail.com] 
Sent: Tuesday, November 03, 2009 11:29 AM
To: lucene-net-user@incubator.apache.org
Subject: Re: Simple question

This thread is getting big...

Franklin, I totally agree this approach can result in a problem, but I don't
know yet how to do this search with TermEnum. What is the basic
documentation to learn (I mean really learn, with all different query types)
about queries?

Thanks


On Tue, Nov 3, 2009 at 13:33, Franklin Simmons <fs...@sccmediaserver.com> wrote:

> André,
>
> You can pass null for the filter parameter. TopDocCollector and
> TopFieldDocCollector let you limit hits. I have to say that while this
> approach may seem OK with a very small index, it will become a major problem
> for you as the index grows, because MatchAllDocsQuery results in a sort
> of all documents in the index that have the sort field.  You should heed the
> advice offered by Digy earlier in this discussion to use TermEnum.
>
>
> -----Original Message-----
> From: André Maldonado [mailto:andre.maldonado@gmail.com]
> Sent: Tuesday, November 03, 2009 8:50 AM
> To: lucene-net-user@incubator.apache.org
> Subject: Re: Simple question
>
> When I do:
>
> Hits hits = searcher.Search(new MatchAllDocsQuery(), sort);
>
> The searcher returns all documents. Can I return only the first 5 documents,
> like a TOP 5 in SQL Server?
>
> I can probably do it using searcher.Search(Query query, Filter filter, int n,
> Sort sort), but I don't have a filter.
>
> How can I do it?
>
> Thanks
>
>
> 2009/11/3 André Maldonado <an...@gmail.com>
>
> > Franklin, the error was exactly that.
> >
> > Some documents had a string where only an int is allowed. After some code
> > adjustments and reindexing everything, it works.
> >
> >
> > Thanks
> >
> >
> > On Fri, Oct 30, 2009 at 18:19, Franklin Simmons <
> > fsimmons@sccmediaserver.com> wrote:
> >
> >> What type of data is represented by your field?
> >>
> >> There are any number of reasons why this could happen, such as using
> >> SortField.INT on a field with terms having non-digit characters.
> >>
> >> Without knowing specifics, I can only offer that you try SortField.STRING.
> >>
> >> -----Original Message-----
> >> From: André Maldonado [mailto:andre.maldonado@gmail.com]
> >> Sent: Friday, October 30, 2009 3:47 PM
> >> To: lucene-net-user@incubator.apache.org
> >> Subject: Re: Simple question
> >>
> >> Hi again Franklin.
> >>
> >> Sorry, but it didn't work. I'm using Lucene.Net 2.3 and doing exactly what
> >> you said, and I'm getting this error:
> >>
> >> System.FormatException: Input string was not in correct format.
> >>   at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
> >>   at System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info)
> >>   at Lucene.Net.Search.FieldCacheImpl.AnonymousClassIntParser.ParseInt(String value_Renamed)
> >>   at Lucene.Net.Search.FieldCacheImpl.AnonymousClassCache2.CreateValue(IndexReader reader, Object entryKey)
> >>   at Lucene.Net.Search.FieldCacheImpl.Cache.Get(IndexReader reader, Object key)
> >>   at Lucene.Net.Search.FieldCacheImpl.GetInts(IndexReader reader, String field, IntParser parser)
> >>   at Lucene.Net.Search.FieldCacheImpl.GetInts(IndexReader reader, String field)
> >>   at Lucene.Net.Search.FieldSortedHitQueue.ComparatorInt(IndexReader reader, String fieldname)
> >>   at Lucene.Net.Search.FieldSortedHitQueue.AnonymousClassCache.CreateValue(IndexReader reader, Object entryKey)
> >>   at Lucene.Net.Search.FieldCacheImpl.Cache.Get(IndexReader reader, Object key)
> >>   at Lucene.Net.Search.FieldSortedHitQueue.GetCachedComparator(IndexReader reader, String field, Int32 type, CultureInfo locale, SortComparatorSource factory)
> >>   at Lucene.Net.Search.FieldSortedHitQueue..ctor(IndexReader reader, SortField[] fields, Int32 size)
> >>   at Lucene.Net.Search.IndexSearcher.Search(Weight weight, Filter filter, Int32 nDocs, Sort sort)
> >>   at Lucene.Net.Search.Hits.GetMoreDocs(Int32 min)
> >>   at Lucene.Net.Search.Hits..ctor(Searcher s, Query q, Filter f, Sort o)
> >>   at Lucene.Net.Search.Searcher.Search(Query query, Sort sort)
> >>   at SearcherLibrary.Searcher.Search(String[] fields, String orderBy, Int32 type, Object analyzer) in c:\Maldonado\projetos\BuscaBlog\Indexer\SearcherLibrary\Searcher.cs:line 252
> >>   at SearcherLibrary.Searcher.Search(String orderBy, sortType type, Int32 hitCount) in c:\Maldonado\projetos\BuscaBlog\Indexer\SearcherLibrary\Searcher.cs:line 313
> >>   at IndexerConsole.Program.Main(String[] args) in c:\Maldonado\projetos\BuscaBlog\Indexer\IndexerConsole\Program.cs:line 21
> >>
> >> Any idea?
> >>
> >> Thanks

RE: Simple question

Posted by Digy <di...@gmail.com>.
Hi André,

>> This thread is getting big...

>> Franklin, I totally agree this approach can result in a problem, but I don't
>> know yet how to do this search with TermEnum. What is the basic
>> documentation to learn (I mean really learn, with all different query types)
>> about queries?


It seems that you have no time to look at the threads where people try to
solve your problem, but others have plenty of time to answer your
questions.

DIGY



Re: Simple question

Posted by André Maldonado <an...@gmail.com>.
This thread is getting big...

Franklin, I totally agree this approach can result in a problem, but I don't
know yet how to do this search with TermEnum. What is the basic
documentation to learn (I mean really learn, with all different query types)
about queries?

Thanks



RE: Simple question

Posted by Franklin Simmons <fs...@sccmediaserver.com>.
André,

You can pass null for the filter parameter. TopDocCollector and TopFieldDocCollector let you limit hits. I have to say that while this approach may seem OK with a very small index, it will become a major problem for you as the index grows, because MatchAllDocsQuery results in a sort of all documents in the index that have the sort field.  You should heed the advice offered by Digy earlier in this discussion to use TermEnum.
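
As a rough illustration of the null-filter call, here is a minimal sketch that assumes the Lucene.Net 2.x overload Search(Query, Filter, int, Sort) quoted in the question below, and that TopFieldDocs exposes its results through the lowercase scoreDocs and doc fields (member names vary between releases, so check your version). The class TopFive and the method Fetch are hypothetical names, and SortField.STRING is only a placeholder for whichever type matches your field.

```csharp
using Lucene.Net.Documents;
using Lucene.Net.Search;

// Hypothetical helper, not part of Lucene.Net.
public static class TopFive
{
    // Returns at most `n` documents ordered by `sortField`, matching every document.
    public static Document[] Fetch(Searcher searcher, string sortField, int n)
    {
        // Ascending sort on the field; pass true as the last argument to reverse it.
        Sort sort = new Sort(new SortField(sortField, SortField.STRING, false));

        // A null filter means "no filter"; n caps how many hits are returned.
        TopFieldDocs top = searcher.Search(new MatchAllDocsQuery(), null, n, sort);

        Document[] result = new Document[top.scoreDocs.Length];
        for (int i = 0; i < top.scoreDocs.Length; i++)
        {
            // Load the stored fields of each hit by its document id.
            result[i] = searcher.Doc(top.scoreDocs[i].doc);
        }
        return result;
    }
}
```

Calling TopFive.Fetch(searcher, "another_field", 5) would then play the role of the SQL TOP 5. The cost concern above still applies: every document with the sort field is considered before the top n are kept.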



Re: Simple question

Posted by André Maldonado <an...@gmail.com>.
When I do:

Hits hits = searcher.Search(new MatchAllDocsQuery(), sort);

The searcher returns all documents. Can I return only the first 5 documents,
like a TOP 5 in SQL Server?

I can probably do it using searcher.Search(Query query, Filter filter, int n,
Sort sort), but I don't have a filter.

How can I do it?

Thanks


"Then those who were in the boat came and worshiped him, saying: Truly you
are the Son of God." (Matthew 14:33)



Re: Simple question

Posted by André Maldonado <an...@gmail.com>.
Franklin, the error was exactly that.

Some documents had a string where only an int is allowed. After some code
adjustments, reindexing everything made it work.

Thanks

"Then those who were in the boat came and worshiped him, saying: Truly you
are the Son of God." (Matthew 14:33)



RE: Simple question

Posted by Franklin Simmons <fs...@sccmediaserver.com>.
What type of data is represented by your field?

There are any number of reasons why this could happen, such as using SortField.INT on a field with terms having non-digit characters.

Without knowing specifics, I can only suggest that you try SortField.STRING.
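
The stack trace quoted below ends in System.Number.ParseInt32, so one way to find the offending values before indexing is to run each one through int.TryParse. This is only a sketch: the class and method names are invented here, and it assumes the sort field is meant to hold 32-bit integers.

```csharp
using System;

public static class SortFieldValueCheck
{
    // Hypothetical helper: true when the value parses as a 32-bit integer.
    // int.TryParse applies the same default rules as int.Parse(string),
    // but reports failure through its return value instead of throwing
    // FormatException.
    public static bool IsSortableAsInt(string value)
    {
        int ignored;
        return int.TryParse(value, out ignored);
    }
}
```

A value such as "42a" fails the check, and is the kind of term that would make an INT sort throw.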

-----Original Message-----
From: André Maldonado [mailto:andre.maldonado@gmail.com] 
Sent: Friday, October 30, 2009 3:47 PM
To: lucene-net-user@incubator.apache.org
Subject: Re: Simple question

Hi again Franklin.

Sorry, but it didn't work. I'm using Lucene.Net 2.3 and, doing exactly what you
said, I'm getting this error:

System.FormatException: Input string was not in correct format.
   em System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
   em System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info)
   em Lucene.Net.Search.FieldCacheImpl.AnonymousClassIntParser.ParseInt(String value_Renamed)
   em Lucene.Net.Search.FieldCacheImpl.AnonymousClassCache2.CreateValue(IndexReader reader, Object entryKey)
   em Lucene.Net.Search.FieldCacheImpl.Cache.Get(IndexReader reader, Object key)
   em Lucene.Net.Search.FieldCacheImpl.GetInts(IndexReader reader, String field, IntParser parser)
   em Lucene.Net.Search.FieldCacheImpl.GetInts(IndexReader reader, String field)
   em Lucene.Net.Search.FieldSortedHitQueue.ComparatorInt(IndexReader reader, String fieldname)
   em Lucene.Net.Search.FieldSortedHitQueue.AnonymousClassCache.CreateValue(IndexReader reader, Object entryKey)
   em Lucene.Net.Search.FieldCacheImpl.Cache.Get(IndexReader reader, Object key)
   em Lucene.Net.Search.FieldSortedHitQueue.GetCachedComparator(IndexReader reader, String field, Int32 type, CultureInfo locale, SortComparatorSource factory)
   em Lucene.Net.Search.FieldSortedHitQueue..ctor(IndexReader reader, SortField[] fields, Int32 size)
   em Lucene.Net.Search.IndexSearcher.Search(Weight weight, Filter filter, Int32 nDocs, Sort sort)
   em Lucene.Net.Search.Hits.GetMoreDocs(Int32 min)
   em Lucene.Net.Search.Hits..ctor(Searcher s, Query q, Filter f, Sort o)
   em Lucene.Net.Search.Searcher.Search(Query query, Sort sort)
   em SearcherLibrary.Searcher.Search(String[] fields, String orderBy, Int32 type, Object analyzer) na c:\Maldonado\projetos\BuscaBlog\Indexer\SearcherLibrary\Searcher.cs:linha 252
   em SearcherLibrary.Searcher.Search(String orderBy, sortType type, Int32 hitCount) na c:\Maldonado\projetos\BuscaBlog\Indexer\SearcherLibrary\Searcher.cs:linha 313
   em IndexerConsole.Program.Main(String[] args) na c:\Maldonado\projetos\BuscaBlog\Indexer\IndexerConsole\Program.cs:linha 21

Any idea?

Thanks

"Then those who were in the boat came and worshiped him, saying: Truly you
are the Son of God." (Matthew 14:33)


On Fri, Oct 30, 2009 at 16:06, Franklin Simmons <fsimmons@sccmediaserver.com> wrote:

> I did it again, think I'll hang it up for the day.  The correct query class
> name is 'MatchAllDocsQuery'.
>
> -----Original Message-----
> From: Franklin Simmons [mailto:fsimmons@sccmediaserver.com]
> Sent: Friday, October 30, 2009 2:06 PM
> To: lucene-net-user@incubator.apache.org
> Subject: RE: Simple question
>
> Oops, I'm not being very helpful.
>
> Use the MatchAllDocumentsQuery class:
>
> Searcher searcher = new IndexSearcher(directory);
>
> Sort sort = new Sort(new SortField("another_field", SortField.AUTO, false));
>
> Hits hits = searcher.Search(new MatchAllDocumentsQuery(), sort);
>
>
> However, that may be a lot of processing.  You may want to tune the query
> in a way to minimize overhead; someone else in the list may suggest a better
> strategy.
>
>
> -----Original Message-----
> From: Franklin Simmons [mailto:fsimmons@sccmediaserver.com]
> Sent: Friday, October 30, 2009 2:01 PM
> To: lucene-net-user@incubator.apache.org
> Subject: RE: Simple question
>
> Hi André,
>
> In this case you simply sort on the field. This may suffice:
>
> Searcher searcher = new IndexSearcher(directory);
>
> Sort sort = new Sort(new SortField("another_field", SortField.AUTO, false));
>
> Hits hits = searcher.Search(query, sort);
>
>
> You can limit the number of hits (e.g. to 5), but I won't get into that
> here.
>
>
> Beyond SortField.AUTO, take a look at the SortField class to see specific
> field types - the most interesting being SortField.CUSTOM.
>
>
> -----Original Message-----
> From: André Maldonado [mailto:andre.maldonado@gmail.com]
> Sent: Friday, October 30, 2009 1:46 PM
> To: lucene-net-user@incubator.apache.org
> Subject: Re: Simple question
>
> Hi Franklin.
>
> Which query do I use for this search (variable: query)? I don't want any query;
> I just want the TOP 5 documents ordered by a field.
>
> Thanks
>
> "Then those who were in the boat came and worshiped him, saying: Truly you
> are the Son of God." (Matthew 14:33)
>
>
> On Fri, Oct 30, 2009 at 15:03, Franklin Simmons <fsimmons@sccmediaserver.com> wrote:
>
> > You can sort a search by multiple fields.  I think you could try something
> > like this:
> >
> > Searcher searcher = new IndexSearcher(directory);
> > Sort sort = new Sort(new SortField[] { SortField.FIELD_SCORE, new SortField("another_field") });
> > Hits hits = searcher.Search(query, sort);
> >
> >
> > -----Original Message-----
> > From: André Maldonado [mailto:andre.maldonado@gmail.com]
> > Sent: Friday, October 30, 2009 12:57 PM
> > To: lucene-net-user@incubator.apache.org
> > Subject: Simple question
> >
> > Hi.
> >
> > This may be a simple question, but I can't figure out the solution.
> >
> > I need to search my index with something like "SELECT TOP 5 ... ORDER BY
> > another_field". But this is an empty query, because I want to search in
> > all documents.
> >
> > How can I do it?
> >
> > Thanks
> >
>

RE: Best way to store book information

Posted by Digy <di...@gmail.com>.
> How does that work, though, if I don't set "Content" to Store.YES? If I don't
> store it, then I can't search it, can I? What I do is search the
> "Content" and use the "Title" and "File" to retrieve the actual HTML page,
> which is in a directory path. Can I still search in the "Content" if I don't
> store it?

YES.

> Should I also use term vectors when storing? If so, which one?

NO NEED.

> Will TermEnum work for a search like "SQL Server database tuning"?
> Do you happen to have an example of doing a search using TermEnum?

I have no idea about "SQL Server database tuning", but TermEnum can be
used to show alternatives while the user is typing a word (if that is what
you are asking). See the discussion "Alternative to looping through Hits".
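
As a rough sketch of that idea, assuming the Lucene.Net 2.x TermEnum API (IndexReader.Terms(Term), and the Term(), Next() and Close() methods): the helper below walks the indexed terms of one field that start with what the user has typed. The names TermSuggestions and StartingWith are invented for this example. With StandardAnalyzer the indexed terms are lowercased, so the prefix would need to be lowercased as well.

```csharp
using System;
using System.Collections.Generic;
using Lucene.Net.Index;

public static class TermSuggestions
{
    // Hypothetical helper: returns up to max indexed terms of `field` that start with `prefix`.
    public static List<string> StartingWith(IndexReader reader, string field, string prefix, int max)
    {
        List<string> results = new List<string>();
        if (max <= 0)
        {
            return results;
        }

        // Terms(Term) positions the enumeration at the first term >= the given one.
        TermEnum terms = reader.Terms(new Term(field, prefix));
        try
        {
            do
            {
                Term t = terms.Term();
                // Stop at the end of the enumeration, or once we leave the field or the prefix.
                if (t == null || t.Field() != field ||
                    !t.Text().StartsWith(prefix, StringComparison.Ordinal))
                {
                    break;
                }
                results.Add(t.Text());
            }
            while (results.Count < max && terms.Next());
        }
        finally
        {
            terms.Close();
        }
        return results;
    }
}
```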

DIGY





RE: Best way to store book information

Posted by Eric Advincula <Er...@co.mohave.az.us>.
How does that work, though, if I don't set "Content" to Store.YES? If I don't store it, then I can't search it, can I? What I do is search the "Content" and use the "Title" and "File" to retrieve the actual HTML page, which is in a directory path. Can I still search in the "Content" if I don't store it?
 
Should I also use term vectors when storing? If so, which one?
 
Will TermEnum work for a search like "SQL Server database tuning"?
Do you happen to have an example of doing a search using TermEnum?


>>> 

From: "Digy" <di...@gmail.com>
To:<lu...@incubator.apache.org>
Date: 10/30/2009 1:41 PM
Subject: RE: Best way to store book information
1. If you want to return the field's content to the user, use
"Store.YES"; otherwise there is no need to store it.
In your case, "Content" can be "Store.NO", since the whole HTML document is
rarely returned to the user.
2. If you want to give some "priority" to a specific field/term, use
boosting. For example, HTML pages thought to be important can be boosted.
3. Use TermEnum
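
To illustrate point 1, here is a small sketch, assuming the Lucene.Net 2.x API (RAMDirectory, IndexWriter, QueryParser, Hits). It indexes "Content" with Store.NO and still finds the document by searching that field; only the stored "File" field can be read back from the hit. The class name, file name and sample text are made up for the example.

```csharp
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.QueryParsers;
using Lucene.Net.Search;
using Lucene.Net.Store;

public static class UnstoredContentExample
{
    // Hypothetical demo: returns the stored "File" value of the first hit, or null if nothing matches.
    public static string FirstMatchingFile(string queryText)
    {
        RAMDirectory dir = new RAMDirectory();
        StandardAnalyzer analyzer = new StandardAnalyzer();

        IndexWriter writer = new IndexWriter(dir, analyzer, true);
        Document doc = new Document();
        // Stored so it can be returned to the user; not tokenized because it is an identifier.
        doc.Add(new Field("File", "tuning.html", Field.Store.YES, Field.Index.UN_TOKENIZED));
        // Indexed (searchable) but not stored: the text cannot be read back from the index.
        doc.Add(new Field("Content", "SQL Server database tuning tips", Field.Store.NO, Field.Index.TOKENIZED));
        writer.AddDocument(doc);
        writer.Close();

        IndexSearcher searcher = new IndexSearcher(dir);
        try
        {
            Query query = new QueryParser("Content", analyzer).Parse(queryText);
            Hits hits = searcher.Search(query);
            return hits.Length() > 0 ? hits.Doc(0).Get("File") : null;
        }
        finally
        {
            searcher.Close();
        }
    }
}
```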

DIGY


-----Original Message-----
From: Eric Advincula [mailto:Eric.Advincula@co.mohave.az.us] 
Sent: Friday, October 30, 2009 10:17 PM
To: lucene-net-user@incubator.apache.org 
Subject: Best way to store book information

I have countless articles in HTML pages, and I'm importing them and parsing
out the text only for my searching. My question is: what is the best way to
store the "Content"?

    doc = new Document();
    doc.Add(new Field("Title", title, Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.Add(new Field("File", page, Field.Store.YES, Field.Index.UN_TOKENIZED));
    content = ParseHTML(file);
    doc.Add(new Field("Content", content.Trim(), Field.Store.YES, Field.Index.TOKENIZED));
    writer.AddDocument(doc);

I'm only searching the "Content" portion, not the other two. So my questions
are:

1.  Should I add term vectors when I save it? If so, which one:
     Yes, With_Positions, With_Offsets, With_Position_Offsets
2.  Should I add boosting to this field?
3.  What is the best way to search the content? Something like when you
type in Google?

Thanks





RE: Excessive IOExceptions in IndexSearcher/QueryParser/FastCharStream?

Posted by Digy <di...@gmail.com>.
It is expected behaviour inherited from Lucene.Java, and I haven't seen a
(remarkable) performance degradation because of this.

DIGY.

-----Original Message-----
From: Ron Grabowski [mailto:rongrabowski@yahoo.com] 
Sent: Saturday, October 31, 2009 12:52 AM
To: lucene-net-user@incubator.apache.org
Subject: Re: Excessive IOExceptions in
IndexSearcher/QueryParser/FastCharStream?

I'm using
https://svn.apache.org/repos/asf/incubator/lucene.net/tags/Lucene.Net_2_4_0/
src/Lucene.Net.



----- Original Message ----
From: Ron Grabowski <ro...@yahoo.com>
To: lucene-net-user@incubator.apache.org
Sent: Fri, October 30, 2009 6:44:34 PM
Subject: Excessive IOExceptions in IndexSearcher/QueryParser/FastCharStream?

I was profiling my search code and saw an awful lot of Exceptions being
thrown for simple usages of QueryParser:

http://www.ronosaurus.com/lucene/indexsearcher_queryparser_ioexception.png

For example this code produces 2 IOExceptions in FastCharStream (line 25):

QueryParser parser = new QueryParser("name", new StandardAnalyzer());
parser.Parse("produce");

Is that normal? In my screenshot there's close to 100 Exceptions within 15
seconds of running some threaded searches.  Would the FastCharStream be
faster if it didn't throw so many Exceptions? I tried hacking CanRead() into
CharStream but didn't get very far.
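
Editor's sketch: the exchange above is about a reader that reports end of input by throwing, with the caller catching the exception as a normal part of tokenizing. The Python below only models that control flow. The names `EndOfInput`, `CharStream`, and `tokenize` are hypothetical and are not Lucene.Net code, and the number of exceptions per parse in the real library may differ from this toy.

```python
class EndOfInput(Exception):
    """Raised by the toy stream when the caller reads past the last character."""


class CharStream:
    """Toy character stream that signals end of input with an exception."""

    def __init__(self, text):
        self._text = text
        self._pos = 0

    def read_char(self):
        if self._pos >= len(self._text):
            raise EndOfInput("read past eof")
        ch = self._text[self._pos]
        self._pos += 1
        return ch


def tokenize(text):
    """Split on whitespace; return the tokens and how many EOF exceptions were caught."""
    stream = CharStream(text)
    tokens, current, eof_signals = [], [], 0
    while True:
        try:
            ch = stream.read_char()
        except EndOfInput:
            eof_signals += 1  # expected: this is how the loop learns it is done
            break
        if ch.isspace():
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens, eof_signals


tokens, eof_signals = tokenize("produce")
```

In a design like this every successful parse raises at least one exception, so a profiler that counts thrown exceptions will report them even though nothing has gone wrong.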


Re: Excessive IOExceptions in IndexSearcher/QueryParser/FastCharStream?

Posted by Ron Grabowski <ro...@yahoo.com>.
I'm using https://svn.apache.org/repos/asf/incubator/lucene.net/tags/Lucene.Net_2_4_0/src/Lucene.Net.



----- Original Message ----
From: Ron Grabowski <ro...@yahoo.com>
To: lucene-net-user@incubator.apache.org
Sent: Fri, October 30, 2009 6:44:34 PM
Subject: Excessive IOExceptions in IndexSearcher/QueryParser/FastCharStream?

I was profiling my search code and saw an awful lot of Exceptions being thrown for simple usages of QueryParser:

http://www.ronosaurus.com/lucene/indexsearcher_queryparser_ioexception.png

For example this code produces 2 IOExceptions in FastCharStream (line 25):

QueryParser parser = new QueryParser("name", new StandardAnalyzer());
parser.Parse("produce");

Is that normal? In my screenshot there's close to 100 Exceptions within 15 seconds of running some threaded searches.  Would the FastCharStream be faster if it didn't throw so many Exceptions? I tried hacking CanRead() into CharStream but didn't get very far.

Excessive IOExceptions in IndexSearcher/QueryParser/FastCharStream?

Posted by Ron Grabowski <ro...@yahoo.com>.
I was profiling my search code and saw an awful lot of Exceptions being thrown for simple usages of QueryParser:

 http://www.ronosaurus.com/lucene/indexsearcher_queryparser_ioexception.png

For example this code produces 2 IOExceptions in FastCharStream (line 25):

 QueryParser parser = new QueryParser("name", new StandardAnalyzer());
 parser.Parse("produce");

Is that normal? In my screenshot there's close to 100 Exceptions within 15 seconds of running some threaded searches.  Would the FastCharStream be faster if it didn't throw so many Exceptions? I tried hacking CanRead() into CharStream but didn't get very far.


RE:

Posted by Digy <di...@gmail.com>.
No. Use the regular IndexSearcher search function with a query such as
"search that phrase" (include the quotation marks).


DIGY

-----Original Message-----
From: Eric Advincula [mailto:Eric.Advincula@co.mohave.az.us] 
Sent: Friday, October 30, 2009 11:11 PM
To: lucene-net-user@incubator.apache.org
Subject: Re:

Thanks,  What I mean about " SQL Server database tuning " was if i type that
as a phrase I want to search for.  Or any kind of phrase that I would like
to search on not just one word searches but entire phrases.  Would TermEnums
still work?
 


>>> 

From: "Digy" <di...@gmail.com>
To:<lu...@incubator.apache.org>
Date: 10/30/2009 2:07 PM
Subject: RE: Best way to store book information
Date: Fri, 30 Oct 2009 23:05:18 +0200

>How does that work though if i dont store the "Content" to yes?  If I dont
store it then i cant search from it can I?.  What I do is search the
"Content" and use the "Title" and "File" to retrieve the actual html page
which is in a directory path.  Can I still search in the "Content" if i dont
store it?

YES. 

> Should I use Vectors also when storing?  If so which one?
NO NEED. 

>Will TermEnum work for searching like "SQL Server database tuning" as a
search?
>Do you happen to have an example on doing a search using TermEnum?

I have no idea about " SQL Server database tuning ". But TermEnum can be
used to show the alternatives while the user is typing a word (if that is what
you are asking). See the discussion "Alternative to looping through Hits".

DIGY




-----Original Message-----
From: Eric Advincula [mailto:Eric.Advincula@co.mohave.az.us] 
Sent: Friday, October 30, 2009 10:54 PM
To: lucene-net-user@incubator.apache.org 
Subject: RE: Best way to store book information

How does that work though if i dont store the "Content" to yes?  If I dont
store it then i cant search from it can I?.  What I do is search the
"Content" and use the "Title" and "File" to retrieve the actual html page
which is in a directory path.  Can I still search in the "Content" if i dont
store it?

Should I use Vectors also when storing?  If so which one?

Will TermEnum work for searching like "SQL Server database tuning" as a
search?
Do you happen to have an example on doing a search using TermEnum?


>>> 

From: "Digy" <di...@gmail.com>
To:<lu...@incubator.apache.org>
Date: 10/30/2009 1:41 PM
Subject: RE: Best way to store book information
1. If you want to return the field's content to the user then use
"Store.YES", otherwise no need to store it. 
In your case, "Content" can be as "Store.NO" since whole html doc is rarely
returned to the user.
2. if you want to give some "priority" to a specific field/term then use
boosting. For ex, some html pages thought to be important can be boosted.
3. Use TermEnum

DIGY


-----Original Message-----
From: Eric Advincula [mailto:Eric.Advincula@co.mohave.az.us] 
Sent: Friday, October 30, 2009 10:17 PM
To: lucene-net-user@incubator.apache.org 
Subject: Best way to store book information

I have countless articles in html pages and i'm importing them and parsing
out the text only for my searching.  My question is what is the best way to
store the "Content"?

                                doc = new Document();
                                doc.Add(new Field("Title", title,
Field.Store.YES, Field.Index.UN_TOKENIZED));

                                doc.Add(new Field("File", page,
Field.Store.YES, Field.Index.UN_TOKENIZED));
                                content = ParseHTML(file);


                                doc.Add(new Field("Content", content.Trim(),
Field.Store.YES, Field.Index.TOKENIZED));
                                writer.AddDocument(doc);

I'm only searching the "Content" portion not the other two.  So my questions
are:

1.  Should I add Vectors when i save it?  If so which one
     Yes, With_Positions, With_Offsets, With_Position_Offsets
2.  Should I add boosting to this Field?
3.  What is the best way to search the content?  Something like when you
type in google?  

Thanks
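
Editor's sketch: the advice in this message is to run a quoted phrase through the normal search path rather than enumerate terms. The toy Python below shows why a phrase is a different operation from a single-term lookup: the terms must occur at consecutive positions in the same document. `build_positions` and `phrase_search` are hypothetical helper names, and whitespace tokenization with lowercasing is a simplifying assumption, not what any particular Lucene.Net analyzer does.

```python
from collections import defaultdict


def build_positions(docs):
    """Map term -> {doc_id: [positions]} for lowercased, whitespace-split text."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in enumerate(docs):
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index


def phrase_search(index, phrase):
    """Return ids of documents where the phrase terms appear consecutively."""
    terms = phrase.lower().split()
    if not terms:
        return []
    # A candidate document must contain every term of the phrase.
    candidates = set(index.get(terms[0], {}))
    for term in terms[1:]:
        candidates &= set(index.get(term, {}))
    hits = []
    for doc_id in sorted(candidates):
        starts = index[terms[0]][doc_id]
        # The phrase matches if, from some start, term i sits at position start + i.
        if any(all(start + i in index[t][doc_id] for i, t in enumerate(terms))
               for start in starts):
            hits.append(doc_id)
    return hits


docs = [
    "SQL Server database tuning for beginners",
    "tuning a database server with SQL scripts",
]
index = build_positions(docs)
phrase_hits = phrase_search(index, "SQL Server database tuning")
```

Both toy documents contain all four words, but only the first has them in order, so only it matches the phrase.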











Re:

Posted by Eric Advincula <Er...@co.mohave.az.us>.
Thanks.  What I meant by "SQL Server database tuning" was typing that in as a phrase I want to search for.  I would like to search on any kind of phrase, not just single words.  Would TermEnum still work?
 


>>> 

From: "Digy" <di...@gmail.com>
To:<lu...@incubator.apache.org>
Date: 10/30/2009 2:07 PM
Subject: RE: Best way to store book information
Date: Fri, 30 Oct 2009 23:05:18 +0200

>How does that work though if i dont store the "Content" to yes?  If I dont
store it then i cant search from it can I?.  What I do is search the
"Content" and use the "Title" and "File" to retrieve the actual html page
which is in a directory path.  Can I still search in the "Content" if i dont
store it?

YES. 

> Should I use Vectors also when storing?  If so which one?
NO NEED. 

>Will TermEnum work for searching like "SQL Server database tuning" as a
search?
>Do you happen to have an example on doing a search using TermEnum?

I have no idea about " SQL Server database tuning ". But TermEnum can be
used to show the alternatives while the user is typing a word (if that is what
you are asking). See the discussion "Alternative to looping through Hits".

DIGY




-----Original Message-----
From: Eric Advincula [mailto:Eric.Advincula@co.mohave.az.us] 
Sent: Friday, October 30, 2009 10:54 PM
To: lucene-net-user@incubator.apache.org 
Subject: RE: Best way to store book information

How does that work though if i dont store the "Content" to yes?  If I dont
store it then i cant search from it can I?.  What I do is search the
"Content" and use the "Title" and "File" to retrieve the actual html page
which is in a directory path.  Can I still search in the "Content" if i dont
store it?

Should I use Vectors also when storing?  If so which one?

Will TermEnum work for searching like "SQL Server database tuning" as a
search?
Do you happen to have an example on doing a search using TermEnum?


>>> 

From: "Digy" <di...@gmail.com>
To:<lu...@incubator.apache.org>
Date: 10/30/2009 1:41 PM
Subject: RE: Best way to store book information
1. If you want to return the field's content to the user then use
"Store.YES", otherwise no need to store it. 
In your case, "Content" can be as "Store.NO" since whole html doc is rarely
returned to the user.
2. if you want to give some "priority" to a specific field/term then use
boosting. For ex, some html pages thought to be important can be boosted.
3. Use TermEnum

DIGY


-----Original Message-----
From: Eric Advincula [mailto:Eric.Advincula@co.mohave.az.us] 
Sent: Friday, October 30, 2009 10:17 PM
To: lucene-net-user@incubator.apache.org 
Subject: Best way to store book information

I have countless articles in html pages and i'm importing them and parsing
out the text only for my searching.  My question is what is the best way to
store the "Content"?

                                doc = new Document();
                                doc.Add(new Field("Title", title,
Field.Store.YES, Field.Index.UN_TOKENIZED));

                                doc.Add(new Field("File", page,
Field.Store.YES, Field.Index.UN_TOKENIZED));
                                content = ParseHTML(file);


                                doc.Add(new Field("Content", content.Trim(),
Field.Store.YES, Field.Index.TOKENIZED));
                                writer.AddDocument(doc);

I'm only searching the "Content" portion not the other two.  So my questions
are:

1.  Should I add Vectors when i save it?  If so which one
     Yes, With_Positions, With_Offsets, With_Position_Offsets
2.  Should I add boosting to this Field?
3.  What is the best way to search the content?  Something like when you
type in google?  

Thanks










RE: Best way to store book information

Posted by Digy <di...@gmail.com>.
1. If you want to return the field's content to the user, then use
"Store.YES"; otherwise there is no need to store it.
In your case, "Content" can use "Store.NO", since the whole HTML doc is rarely
returned to the user.
2. If you want to give some "priority" to a specific field/term, then use
boosting. For example, HTML pages thought to be important can be boosted.
3. Use TermEnum

DIGY


-----Original Message-----
From: Eric Advincula [mailto:Eric.Advincula@co.mohave.az.us] 
Sent: Friday, October 30, 2009 10:17 PM
To: lucene-net-user@incubator.apache.org
Subject: Best way to store book information

I have countless articles in html pages and i'm importing them and parsing
out the text only for my searching.  My question is what is the best way to
store the "Content"?
 
                                doc = new Document();
                                doc.Add(new Field("Title", title,
Field.Store.YES, Field.Index.UN_TOKENIZED));

                                doc.Add(new Field("File", page,
Field.Store.YES, Field.Index.UN_TOKENIZED));
                                content = ParseHTML(file);
 

                                doc.Add(new Field("Content", content.Trim(),
Field.Store.YES, Field.Index.TOKENIZED));
                                writer.AddDocument(doc);
 
I'm only searching the "Content" portion not the other two.  So my questions
are:

1.  Should I add Vectors when i save it?  If so which one
     Yes, With_Positions, With_Offsets, With_Position_Offsets
2.  Should I add boosting to this Field?
3.  What is the best way to search the content?  Something like when you
type in google?  
 
Thanks
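
Editor's sketch: point 1 of the reply above separates two things that are easy to conflate, namely whether a field is indexed (searchable) and whether its original value is stored (retrievable). The toy model below keeps the two in separate structures. `TinyIndex` and its methods are hypothetical names, the file path is made up for illustration, and the tokenization is a simplifying assumption rather than Lucene.Net behaviour.

```python
from collections import defaultdict


class TinyIndex:
    """Toy index that keeps searchable terms and stored values separately."""

    def __init__(self):
        self.postings = defaultdict(set)  # (field, term) -> set of doc ids
        self.stored = []                  # doc id -> {field: original value}

    def add(self, fields):
        """fields maps name -> (value, store, tokenize)."""
        doc_id = len(self.stored)
        kept = {}
        for name, (value, store, tokenize) in fields.items():
            # Tokenized fields are split into lowercased words;
            # untokenized fields are indexed as one exact term.
            terms = value.lower().split() if tokenize else [value]
            for term in terms:
                self.postings[(name, term)].add(doc_id)
            if store:
                kept[name] = value
        self.stored.append(kept)
        return doc_id

    def search(self, field, term):
        return sorted(self.postings.get((field, term), set()))


index = TinyIndex()
index.add({
    "Title": ("Tuning guide", True, False),            # stored, not tokenized
    "File": ("articles/tuning.html", True, False),     # stored, not tokenized
    "Content": ("SQL Server database tuning tips", False, True),  # indexed only
})
hits = index.search("Content", "tuning")
stored_fields = index.stored[hits[0]]
```

The search on "Content" finds the document even though its text was never stored, and the stored "File" value is still available to locate the HTML page on disk.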


Best way to store book information

Posted by Eric Advincula <Er...@co.mohave.az.us>.
I have countless articles in HTML pages, and I'm importing them and parsing out the text only for my searching.  My question is: what is the best way to store the "Content"?
 
    doc = new Document();
    doc.Add(new Field("Title", title, Field.Store.YES, Field.Index.UN_TOKENIZED));
    doc.Add(new Field("File", page, Field.Store.YES, Field.Index.UN_TOKENIZED));

    content = ParseHTML(file);
    doc.Add(new Field("Content", content.Trim(), Field.Store.YES, Field.Index.TOKENIZED));
    writer.AddDocument(doc);
 
I'm only searching the "Content" portion not the other two.  So my questions are:

1.  Should I add Vectors when i save it?  If so which one
     Yes, With_Positions, With_Offsets, With_Position_Offsets
2.  Should I add boosting to this Field?
3.  What is the best way to search the content?  Something like when you type in google?  
 
Thanks

Re: Simple question

Posted by André Maldonado <an...@gmail.com>.
Hi again Franklin.

Sorry, but it didn't work. I'm using Lucene.Net 2.3 and, doing exactly what
you said, I'm getting this error:

System.FormatException: Input string was not in correct format.
   at System.Number.StringToNumber(String str, NumberStyles options, NumberBuffer& number, NumberFormatInfo info, Boolean parseDecimal)
   at System.Number.ParseInt32(String s, NumberStyles style, NumberFormatInfo info)
   at Lucene.Net.Search.FieldCacheImpl.AnonymousClassIntParser.ParseInt(String value_Renamed)
   at Lucene.Net.Search.FieldCacheImpl.AnonymousClassCache2.CreateValue(IndexReader reader, Object entryKey)
   at Lucene.Net.Search.FieldCacheImpl.Cache.Get(IndexReader reader, Object key)
   at Lucene.Net.Search.FieldCacheImpl.GetInts(IndexReader reader, String field, IntParser parser)
   at Lucene.Net.Search.FieldCacheImpl.GetInts(IndexReader reader, String field)
   at Lucene.Net.Search.FieldSortedHitQueue.ComparatorInt(IndexReader reader, String fieldname)
   at Lucene.Net.Search.FieldSortedHitQueue.AnonymousClassCache.CreateValue(IndexReader reader, Object entryKey)
   at Lucene.Net.Search.FieldCacheImpl.Cache.Get(IndexReader reader, Object key)
   at Lucene.Net.Search.FieldSortedHitQueue.GetCachedComparator(IndexReader reader, String field, Int32 type, CultureInfo locale, SortComparatorSource factory)
   at Lucene.Net.Search.FieldSortedHitQueue..ctor(IndexReader reader, SortField[] fields, Int32 size)
   at Lucene.Net.Search.IndexSearcher.Search(Weight weight, Filter filter, Int32 nDocs, Sort sort)
   at Lucene.Net.Search.Hits.GetMoreDocs(Int32 min)
   at Lucene.Net.Search.Hits..ctor(Searcher s, Query q, Filter f, Sort o)
   at Lucene.Net.Search.Searcher.Search(Query query, Sort sort)
   at SearcherLibrary.Searcher.Search(String[] fields, String orderBy, Int32 type, Object analyzer) in c:\Maldonado\projetos\BuscaBlog\Indexer\SearcherLibrary\Searcher.cs:line 252
   at SearcherLibrary.Searcher.Search(String orderBy, sortType type, Int32 hitCount) in c:\Maldonado\projetos\BuscaBlog\Indexer\SearcherLibrary\Searcher.cs:line 313
   at IndexerConsole.Program.Main(String[] args) in c:\Maldonado\projetos\BuscaBlog\Indexer\IndexerConsole\Program.cs:line 21

Any idea?

Thanks

"Then those who were in the boat came and worshiped him, saying: Truly you
are the Son of God." (Matthew 14:33)
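
Editor's sketch: the trace in this message passes through ComparatorInt and ParseInt, which suggests the automatic sort type was resolved to integer and a later value in the field could not be parsed. The toy below models one plausible way that happens, under the assumption that the type is guessed from a single sample term (the first one in string order). `guess_sort_type`, `top_n`, and the sample values are hypothetical and are not Lucene.Net code.

```python
def guess_sort_type(sample):
    """Guess a sort type from one sample term (assumed model of an 'auto' type)."""
    for name, parse in (("int", int), ("float", float)):
        try:
            parse(sample)
            return name
        except ValueError:
            continue
    return "string"


PARSERS = {"int": int, "float": float, "string": str}


def top_n(values, n, sort_type):
    """Return the ids of the first n documents when sorted by the parsed field value."""
    parse = PARSERS[sort_type]
    keyed = [(parse(value), doc_id) for doc_id, value in enumerate(values)]
    return [doc_id for _, doc_id in sorted(keyed)[:n]]


# One field value per document; most look numeric, one does not.
values = ["7", "2009-10-30", "15", "42", "3", "108"]

first_term = min(values)            # term dictionaries are ordered as strings
guessed = guess_sort_type(first_term)

try:
    top_n(values, 5, guessed)       # parses every value with the guessed type
    auto_failed = False
except ValueError:
    auto_failed = True

top_five = top_n(values, 5, "string")  # an explicit type avoids the guess
```

If the field really holds mixed or non-numeric text, choosing an explicit string sort type instead of the automatic one is one way to avoid the parse. If it is meant to be numeric, the stray value is the thing to fix.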


On Fri, Oct 30, 2009 at 16:06, Franklin Simmons <fsimmons@sccmediaserver.com
> wrote:

> I did it again, think I'll hang it up for the day.  The correct query class
> name is 'MatchAllDocsQuery'.
>
> -----Original Message-----
> From: Franklin Simmons [mailto:fsimmons@sccmediaserver.com]
> Sent: Friday, October 30, 2009 2:06 PM
> To: lucene-net-user@incubator.apache.org
> Subject: RE: Simple question
>
> Oops, I'm not being very helpful.
>
> Use the MatchAllDocumentsQuery class:
>
> Searcher searcher = new IndexSearcher(directory);
>
> Sort = new Sort(new SortField("another_field", SortField.AUTO, false));
>
> Hits hits = searcher.search(new MatchAllDocumentsQuery(),sort);
>
>
> However, that may be a lot of processing.  You may want to tune the query
> in a way to minimize overhead; someone else in the list may suggest a better
> strategy.
>
>
> -----Original Message-----
> From: Franklin Simmons [mailto:fsimmons@sccmediaserver.com]
> Sent: Friday, October 30, 2009 2:01 PM
> To: lucene-net-user@incubator.apache.org
> Subject: RE: Simple question
>
> Hi André,
>
> In this case you simply sort on the field. This may suffice:
>
> Searcher searcher = new IndexSearcher(directory);
>
> Sort = new Sort(new SortField("another_field", SortField.AUTO, false));
>
> Hits hits = searcher.search(query,sort);
>
>
> You can limit the number of hits (e.g. to 5), but I won't get into that
> here.
>
>
> Beyond SortField.AUTO, take a look at the SortField class to see specific
> field types - the most interesting being SortField.CUSTOM.
>
>
> -----Original Message-----
> From: André Maldonado [mailto:andre.maldonado@gmail.com]
> Sent: Friday, October 30, 2009 1:46 PM
> To: lucene-net-user@incubator.apache.org
> Subject: Re: Simple question
>
> Hi Franklin.
>
> Wich query I use for this search (variable: query)? I don't want any query,
> I just want the TOP 5 documents ordered by a field.
>
> Thank's
>
> "Então aproximaram-se os que estavam no barco, e adoraram-no, dizendo: És
> verdadeiramente o Filho de Deus." (Mateus 14:33)
>
>
> On Fri, Oct 30, 2009 at 15:03, Franklin Simmons <
> fsimmons@sccmediaserver.com
> > wrote:
>
> > You can sort a search by multiple fields.  I think you could try
> something
> > like this:
> >
> > Searcher searcher = new IndexSearcher(directory);
> > Sort = new Sort(new SortField[] { SortField.FIELD_SCORE, new
> > SortField("another_field") };
> > Hits hits = searcher.search(query,sort);
> >
> >
> > -----Original Message-----
> > From: André Maldonado [mailto:andre.maldonado@gmail.com]
> > Sent: Friday, October 30, 2009 12:57 PM
> > To: lucene-net-user@incubator.apache.org
> > Subject: Simple question
> >
> > Hi.
> >
> > This can be a simple question, but I can't figure out the solution.
> >
> > I need to search my index in something like "SELECT TOP 5 ... ORDER BY
> > another_field". But this is an empty query because I want to search in
> all
> > documents.
> >
> > How can I do it?
> >
> > Thank's
> >
>

RE: Simple question

Posted by Franklin Simmons <fs...@sccmediaserver.com>.
I did it again, think I'll hang it up for the day.  The correct query class name is 'MatchAllDocsQuery'.

-----Original Message-----
From: Franklin Simmons [mailto:fsimmons@sccmediaserver.com] 
Sent: Friday, October 30, 2009 2:06 PM
To: lucene-net-user@incubator.apache.org
Subject: RE: Simple question

Oops, I'm not being very helpful.

Use the MatchAllDocumentsQuery class:

Searcher searcher = new IndexSearcher(directory);

Sort = new Sort(new SortField("another_field", SortField.AUTO, false));

Hits hits = searcher.search(new MatchAllDocumentsQuery(),sort);


However, that may be a lot of processing.  You may want to tune the query in a way to minimize overhead; someone else in the list may suggest a better strategy.


-----Original Message-----
From: Franklin Simmons [mailto:fsimmons@sccmediaserver.com] 
Sent: Friday, October 30, 2009 2:01 PM
To: lucene-net-user@incubator.apache.org
Subject: RE: Simple question

Hi André,

In this case you simply sort on the field. This may suffice:

Searcher searcher = new IndexSearcher(directory);

Sort = new Sort(new SortField("another_field", SortField.AUTO, false));

Hits hits = searcher.search(query,sort);


You can limit the number of hits (e.g. to 5), but I won't get into that here.


Beyond SortField.AUTO, take a look at the SortField class to see specific field types - the most interesting being SortField.CUSTOM.


-----Original Message-----
From: André Maldonado [mailto:andre.maldonado@gmail.com] 
Sent: Friday, October 30, 2009 1:46 PM
To: lucene-net-user@incubator.apache.org
Subject: Re: Simple question

Hi Franklin.

Wich query I use for this search (variable: query)? I don't want any query,
I just want the TOP 5 documents ordered by a field.

Thank's

"Então aproximaram-se os que estavam no barco, e adoraram-no, dizendo: És
verdadeiramente o Filho de Deus." (Mateus 14:33)


On Fri, Oct 30, 2009 at 15:03, Franklin Simmons <fsimmons@sccmediaserver.com
> wrote:

> You can sort a search by multiple fields.  I think you could try something
> like this:
>
> Searcher searcher = new IndexSearcher(directory);
> Sort = new Sort(new SortField[] { SortField.FIELD_SCORE, new
> SortField("another_field") };
> Hits hits = searcher.search(query,sort);
>
>
> -----Original Message-----
> From: André Maldonado [mailto:andre.maldonado@gmail.com]
> Sent: Friday, October 30, 2009 12:57 PM
> To: lucene-net-user@incubator.apache.org
> Subject: Simple question
>
> Hi.
>
> This can be a simple question, but I can't figure out the solution.
>
> I need to search my index in something like "SELECT TOP 5 ... ORDER BY
> another_field". But this is an empty query because I want to search in all
> documents.
>
> How can I do it?
>
> Thank's
>

RE: Simple question

Posted by Franklin Simmons <fs...@sccmediaserver.com>.
Oops, I'm not being very helpful.

Use the MatchAllDocumentsQuery class:

Searcher searcher = new IndexSearcher(directory);

Sort sort = new Sort(new SortField("another_field", SortField.AUTO, false));

Hits hits = searcher.Search(new MatchAllDocumentsQuery(), sort);


However, that may be a lot of processing.  You may want to tune the query in a way to minimize overhead; someone else in the list may suggest a better strategy.


-----Original Message-----
From: Franklin Simmons [mailto:fsimmons@sccmediaserver.com] 
Sent: Friday, October 30, 2009 2:01 PM
To: lucene-net-user@incubator.apache.org
Subject: RE: Simple question

Hi André,

In this case you simply sort on the field. This may suffice:

Searcher searcher = new IndexSearcher(directory);

Sort = new Sort(new SortField("another_field", SortField.AUTO, false));

Hits hits = searcher.search(query,sort);


You can limit the number of hits (e.g. to 5), but I won't get into that here.


Beyond SortField.AUTO, take a look at the SortField class to see specific field types - the most interesting being SortField.CUSTOM.


-----Original Message-----
From: André Maldonado [mailto:andre.maldonado@gmail.com] 
Sent: Friday, October 30, 2009 1:46 PM
To: lucene-net-user@incubator.apache.org
Subject: Re: Simple question

Hi Franklin.

Wich query I use for this search (variable: query)? I don't want any query,
I just want the TOP 5 documents ordered by a field.

Thank's

"Então aproximaram-se os que estavam no barco, e adoraram-no, dizendo: És
verdadeiramente o Filho de Deus." (Mateus 14:33)


On Fri, Oct 30, 2009 at 15:03, Franklin Simmons <fsimmons@sccmediaserver.com
> wrote:

> You can sort a search by multiple fields.  I think you could try something
> like this:
>
> Searcher searcher = new IndexSearcher(directory);
> Sort = new Sort(new SortField[] { SortField.FIELD_SCORE, new
> SortField("another_field") };
> Hits hits = searcher.search(query,sort);
>
>
> -----Original Message-----
> From: André Maldonado [mailto:andre.maldonado@gmail.com]
> Sent: Friday, October 30, 2009 12:57 PM
> To: lucene-net-user@incubator.apache.org
> Subject: Simple question
>
> Hi.
>
> This can be a simple question, but I can't figure out the solution.
>
> I need to search my index in something like "SELECT TOP 5 ... ORDER BY
> another_field". But this is an empty query because I want to search in all
> documents.
>
> How can I do it?
>
> Thank's
>

RE: Simple question

Posted by Franklin Simmons <fs...@sccmediaserver.com>.
Hi André,

In this case you simply sort on the field. This may suffice:

Searcher searcher = new IndexSearcher(directory);

Sort sort = new Sort(new SortField("another_field", SortField.AUTO, false));

Hits hits = searcher.Search(query, sort);


You can limit the number of hits (e.g. to 5), but I won't get into that here.


Beyond SortField.AUTO, take a look at the SortField class to see specific field types - the most interesting being SortField.CUSTOM.


-----Original Message-----
From: André Maldonado [mailto:andre.maldonado@gmail.com] 
Sent: Friday, October 30, 2009 1:46 PM
To: lucene-net-user@incubator.apache.org
Subject: Re: Simple question

Hi Franklin.

Wich query I use for this search (variable: query)? I don't want any query,
I just want the TOP 5 documents ordered by a field.

Thank's

"Então aproximaram-se os que estavam no barco, e adoraram-no, dizendo: És
verdadeiramente o Filho de Deus." (Mateus 14:33)


On Fri, Oct 30, 2009 at 15:03, Franklin Simmons <fsimmons@sccmediaserver.com
> wrote:

> You can sort a search by multiple fields.  I think you could try something
> like this:
>
> Searcher searcher = new IndexSearcher(directory);
> Sort = new Sort(new SortField[] { SortField.FIELD_SCORE, new
> SortField("another_field") };
> Hits hits = searcher.search(query,sort);
>
>
> -----Original Message-----
> From: André Maldonado [mailto:andre.maldonado@gmail.com]
> Sent: Friday, October 30, 2009 12:57 PM
> To: lucene-net-user@incubator.apache.org
> Subject: Simple question
>
> Hi.
>
> This can be a simple question, but I can't figure out the solution.
>
> I need to search my index in something like "SELECT TOP 5 ... ORDER BY
> another_field". But this is an empty query because I want to search in all
> documents.
>
> How can I do it?
>
> Thank's
>

Re: Simple question

Posted by André Maldonado <an...@gmail.com>.
Hi Franklin.

Which query do I use for this search (variable: query)? I don't want any
query; I just want the TOP 5 documents ordered by a field.

Thanks

"Then those who were in the boat came and worshiped him, saying: Truly you
are the Son of God." (Matthew 14:33)


On Fri, Oct 30, 2009 at 15:03, Franklin Simmons <fsimmons@sccmediaserver.com
> wrote:

> You can sort a search by multiple fields.  I think you could try something
> like this:
>
> Searcher searcher = new IndexSearcher(directory);
> Sort = new Sort(new SortField[] { SortField.FIELD_SCORE, new
> SortField("another_field") };
> Hits hits = searcher.search(query,sort);
>
>
> -----Original Message-----
> From: André Maldonado [mailto:andre.maldonado@gmail.com]
> Sent: Friday, October 30, 2009 12:57 PM
> To: lucene-net-user@incubator.apache.org
> Subject: Simple question
>
> Hi.
>
> This can be a simple question, but I can't figure out the solution.
>
> I need to search my index in something like "SELECT TOP 5 ... ORDER BY
> another_field". But this is an empty query because I want to search in all
> documents.
>
> How can I do it?
>
> Thank's
>

RE: Simple question

Posted by Franklin Simmons <fs...@sccmediaserver.com>.
You can sort a search by multiple fields.  I think you could try something like this:

Searcher searcher = new IndexSearcher(directory);
Sort sort = new Sort(new SortField[] { SortField.FIELD_SCORE, new SortField("another_field") });
Hits hits = searcher.Search(query, sort);


-----Original Message-----
From: André Maldonado [mailto:andre.maldonado@gmail.com] 
Sent: Friday, October 30, 2009 12:57 PM
To: lucene-net-user@incubator.apache.org
Subject: Simple question

Hi.

This can be a simple question, but I can't figure out the solution.

I need to search my index in something like "SELECT TOP 5 ... ORDER BY
another_field". But this is an empty query because I want to search in all
documents.

How can I do it?

Thank's

RE: Simple question

Posted by Digy <di...@gmail.com>.
Sorting on a large index can be very costly. I think you can use TermEnum to
get the terms of "another_field" (note that terms are stored sorted in the
index) and using the "TermDocs" you can get the corresponding docs of those
terms.

See the discussion "Alternative to looping through Hits".

DIGY.

-----Original Message-----
From: André Maldonado [mailto:andre.maldonado@gmail.com] 
Sent: Friday, October 30, 2009 6:57 PM
To: lucene-net-user@incubator.apache.org
Subject: Simple question

Hi.

This can be a simple question, but I can't figure out the solution.

I need to search my index in something like "SELECT TOP 5 ... ORDER BY
another_field". But this is an empty query because I want to search in all
documents.

How can I do it?

Thank's
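
Editor's sketch: the reply above proposes skipping the sort entirely by walking the field's terms, which the index already keeps in order, and collecting the documents attached to each term until enough have been found. The Python below models that idea with a sorted list of (term, doc ids) pairs. `build_term_dictionary` and `first_n_docs` are hypothetical names, and the sample values are invented for illustration.

```python
from collections import defaultdict


def build_term_dictionary(values_by_doc):
    """Return [(term, [doc ids])] sorted by term, one field value per document."""
    postings = defaultdict(list)
    for doc_id, value in enumerate(values_by_doc):
        postings[value].append(doc_id)
    return sorted(postings.items())


def first_n_docs(term_dictionary, n):
    """Walk terms in order and stop as soon as n documents have been collected."""
    result = []
    for term, doc_ids in term_dictionary:
        for doc_id in doc_ids:
            result.append((term, doc_id))
            if len(result) == n:
                return result
    return result


values = ["pear", "apple", "fig", "apple", "kiwi", "banana", "cherry"]
dictionary = build_term_dictionary(values)
top_five = first_n_docs(dictionary, 5)
```

The walk stops after five documents instead of loading every value, which is where the saving on a large index would come from. The order is plain string order, so numeric values would need zero-padding (or similar) to come out in numeric order, since "10" sorts before "9" as text.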