Posted to user@lucenenet.apache.org by Ravi <ra...@gmail.com> on 2008/08/13 15:44:56 UTC

problem in the sorting the results from lucene.

Hi, I am having trouble sorting the results from Lucene.
I want to sort the results by the date of an object. I use an
"EDITDATE" field for this purpose, as shown below:

doc->Add(new
Field("EDITDATE",ix->getEditDate(s)->ToString(),Lucene::Net::Documents::Field::Store::NO,Lucene::Net::Documents::Field::Index::UN_TOKENIZED));

I read in 'Lucene in Action' that a field used for sorting must be
indexed but UN_TOKENIZED. However, I get the results in what looks like
random order.

I am using the following code in the search app:

Query *query = qp->Parse(querydata->ToString());
BooleanQuery *boolquery = new BooleanQuery();
boolquery->Add(query,BooleanClause::Occur::MUST);
Sort *sort = new Sort("EDITDATE",true);
Hits *hits = searcher->Search(boolquery,sort);

Can you please suggest what is wrong in this example?

-- 
Thanks & Regards...
Ravindra V. Gaikwad.
S.E., Colayer Gmbh.

Re: System.Intern(), memory leak, and sorting

Posted by Min Yin <yi...@AI.SRI.COM>.
In case I've completely confused people: it is String.Intern(), not
System.Intern(), that I was talking about (don't know what I was thinking,
probably the lunch :-P). It seems that when String.Intern(s) is used, as in
the original source code, the comparison is done on (System.Object), but
when s itself is used, the comparison should be done on (System.String)
instead.
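
The difference between the two kinds of comparison can be sketched outside .NET. The following is a C++ analogy only, not Lucene.Net code: InternPool, sameObject, sameValue and demoInterning are hypothetical names invented for this illustration. It assumes that the affected code compares strings by identity, which holds only while every string has gone through the same intern pool.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Toy intern pool (hypothetical): hands out one canonical pointer per
// distinct string value. The pool keeps every string it has seen and never
// frees any of them, which is why interning can look like a leak.
class InternPool {
public:
    const std::string* intern(const std::string& s) {
        // insert() returns the existing element if the value is already
        // present; element addresses in unordered_set stay valid on rehash.
        return &*pool_.insert(s).first;
    }
    std::size_t size() const { return pool_.size(); }

private:
    std::unordered_set<std::string> pool_;
};

// Identity comparison: true only when both pointers name the same object.
bool sameObject(const std::string* a, const std::string* b) { return a == b; }

// Value comparison: true when the contents match, whichever objects they are.
bool sameValue(const std::string* a, const std::string* b) { return *a == *b; }

void demoInterning() {
    InternPool pool;
    const std::string a = "EDITDATE";
    const std::string b = std::string("EDIT") + "DATE";

    // Two separate objects with equal contents: identity fails, value works.
    assert(!sameObject(&a, &b));
    assert(sameValue(&a, &b));

    // After interning, both map to one object, so identity works as well.
    assert(sameObject(pool.intern(a), pool.intern(b)));
    assert(pool.size() == 1);
}
```

If the interning step is removed, any remaining identity comparison has to become a value comparison, otherwise equal field names or terms stop matching.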

Min



Re: System.Intern(), memory leak, and sorting

Posted by Min Yin <yi...@AI.SRI.COM>.
Thanks Digy, we've already applied that patch. We are replacing the
System.Intern() calls to try to stop additional leakage.

Min



RE: System.Intern(), memory leak, and sorting

Posted by Digy <di...@gmail.com>.
Hi Min,

Have you read http://issues.apache.org/jira/browse/LUCENENET-106?
Doesn't it solve the leakage problem (WeakHashTable+FieldCacheImpl.rar)?

DIGY


-----Original Message-----
From: Min Yin [mailto:yin@AI.SRI.COM] 
Sent: Tuesday, August 19, 2008 9:42 PM
To: lucene-net-user@incubator.apache.org
Subject: System.Intern(), memory leak, and sorting



System.Intern(), memory leak, and sorting

Posted by Min Yin <yi...@AI.SRI.COM>.
Hi there,

In our attempt to fix the memory leak (we are using v2.1.0), we
replaced all the System.Intern(s) calls with s itself in the source code.
The theory is that System.Intern(s) causes the system to hold s forever
without releasing it. That (with a WeakHashTable) seems to correct the
leak, but we then found that sorting of the returned results was broken,
and sorting began to work again immediately after we switched back to
System.Intern(s).

I'm looking into the source code to understand why this is so. Any insight
into how to deal with the situation so that we can keep the memory intact
and the sort working? I also wonder whether the new version (v2.3.0 or ?)
will fix the memory leak so that we don't need to change the source code.

Many Thanks in advance!
Min

RE: problem in the sorting the results from lucene.

Posted by Digy <di...@gmail.com>.
If your date field is not in a form like 'YYYYMMDD HHmmss', your
sort will be wrong (since it is a string comparison).
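
A minimal sketch of the string-comparison point, in plain standard C++ rather than the Lucene.Net API. The names sortableKey, sortedCopy and demoStringOrdering are hypothetical, and the day-first sample strings are an assumption: the thread does not show what getEditDate(s)->ToString() actually returns.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper (not part of Lucene.Net): builds a fixed-width,
// zero-padded "yyyyMMddHHmmss" key, most significant unit first, so that
// plain string comparison agrees with chronological order.
std::string sortableKey(int year, int month, int day,
                        int hour, int minute, int second) {
    char buf[32];
    std::snprintf(buf, sizeof buf, "%04d%02d%02d%02d%02d%02d",
                  year, month, day, hour, minute, second);
    return std::string(buf);
}

// Returns the values sorted by ordinary string comparison, which is the
// kind of comparison a sort on a string field performs.
std::vector<std::string> sortedCopy(std::vector<std::string> values) {
    std::sort(values.begin(), values.end());
    return values;
}

void demoStringOrdering() {
    // Day-first strings: sorted as text this gives 2 Sep, 5 Mar, 13 Aug,
    // which is neither ascending nor descending by date.
    const std::vector<std::string> dayFirst =
        sortedCopy({"13/08/2008", "02/09/2008", "05/03/2008"});
    assert(dayFirst[0] == "02/09/2008");
    assert(dayFirst[1] == "05/03/2008");
    assert(dayFirst[2] == "13/08/2008");

    // The same three dates as sortable keys: text order is now date order.
    const std::vector<std::string> keys = sortedCopy({
        sortableKey(2008, 8, 13, 0, 0, 0),
        sortableKey(2008, 9, 2, 0, 0, 0),
        sortableKey(2008, 3, 5, 0, 0, 0)});
    assert(keys[0] == "20080305000000");
    assert(keys[1] == "20080813000000");
    assert(keys[2] == "20080902000000");
}
```

Under these assumptions, the value stored in "EDITDATE" at index time would be the fixed-width key rather than a display string.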

DIGY

-----Original Message-----
From: Granroth, Neal V. [mailto:neal.granroth@thermofisher.com] 
Sent: Wednesday, August 13, 2008 10:07 PM
To: lucene-net-user@incubator.apache.org
Subject: RE: problem in the sorting the results from lucene.



RE: problem in the sorting the results from lucene.

Posted by "Granroth, Neal V." <ne...@thermofisher.com>.
In your search app, the BooleanQuery object is unnecessary as you have only added one query to it.  Try this instead:

Hits *hits = searcher->Search(query,sort);

I am not very confident that this is the problem, but there is a possibility that the BooleanQuery, as you constructed it, is confusing the sort.

If that does not cure the problem, then I would look at the index construction code.
The content of the "EDITDATE" field must be a string, integer, or float (according to "Lucene in Action" for Lucene version 1.9.1).

I notice that you are flagging the field as "Store::NO"; it is possible that this needs to be "Store::YES".

In my application, all of the fields for which I have successfully used sort are added to the document using C# statements similar to this:

doc.Add( Field.Keyword("filetype", sType ) );

Hope this provides some clues to a solution.
-- Neal

