You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by DECAFFMEYER MATHIEU <MA...@fortis.lu> on 2007/03/13 17:23:44 UTC

[Urgent] deleteDocuments fails after merging ...

Hi,

I have put this question as "urgent" because I can notice I don't have
often answers,
If I'm asking the wrong way, please tell me...


Before I delete a document I search it in the index to be sure there is
a hit (via a Term object),
When I find a hit I delete the document (with the same Term object),

But there is something very odd happing sometimes :

I find a hit, but the integer returned by deleteDocuments is equal to 0
This often happens after I merged the index for the first time,
it never happens before I merge ...



Term urlTerm = new Term("url", urlToDel);
Query query = new TermQuery(urlTerm);

Hits hits = search(query);
if (hits.length() > 0) {
	if (hits.length() > 1) {
		System.out.println("found in the index with
duplicates");
	}
	System.out.println("found in the index");
	try {
		int nb = mIndexReaderClone.deleteDocuments(urlTerm);
		if (nb > 0)
			System.out.println("successfully deleted");
		else
			throw new IOException("0 doc deleted");
	} catch (IOException e) {
		e.printStackTrace();
		throw new Exception(
			Thread.currentThread().getName() + " ---
Deleting old entry failed",
			e);
	}


What I get sometimes is :

found in the index
0 doc deleted


__________________________________

   Matt



============================================
Internet communications are not secure and therefore Fortis Banque Luxembourg S.A. does not accept legal responsibility for the contents of this message. The information contained in this e-mail is confidential and may be legally privileged. It is intended solely for the addressee. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. Nothing in the message is capable or intended to create any legally binding obligations on either party and it is not intended to provide legal advice.
============================================


Re: [Urgent] deleteDocuments fails after merging ...

Posted by Chris Hostetter <ho...@fucit.org>.
: > the only real reason you should really need 2 searchers at a time is if
: > you are searching other queries in parallel threads at the same time ...
: > or if you are warming up one new searcher that's "ondeck" while still
: > serving queries with an older searcher.
:
: Hoss, I hope I misunderstood this: are you saying that the same
: IndexSearcher/IndexReader pair can not be used concurrently against a single
: index by different threads executing different queries?

no i'm saying the only reason you need two searhsers are:

  1) if, seperate from the searcher you are using to deletes (which you
seem to have a use case that involves reopening to check the deletes) you
also wnat a searcher open continuously which you use to search search
clients.

  2) if, for performance reasons, when opening a new searcher to expose a
new version of hte index, you want to open the new one, warm it up with
some queries, and only then direct new threads to the new searcher
and close the old searcher.




-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: [Urgent] deleteDocuments fails after merging ...

Posted by Antony Bowesman <ad...@teamware.com>.
Chris Hostetter wrote:
> the only real reason you should really need 2 searchers at a time is if
> you are searching other queries in parallel threads at the same time ...
> or if you are warming up one new searcher that's "ondeck" while still
> serving queries with an older searcher.

Hoss, I hope I misunderstood this: are you saying that the same 
IndexSearcher/IndexReader pair can not be used concurrently against a single 
index by different threads executing different queries?

The archives have several mentions of sharing IndexSearcher among threads and 
Otis says http://www.jguru.com/faq/view.jsp?EID=492393.

Can you clarify what you meant please.

Antony




---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: [Urgent] deleteDocuments fails after merging ...

Posted by Chris Hostetter <ho...@fucit.org>.
: I just have two IndexSearchers opened now most of the time, which is
: deprecated,
: But I think that's my only choice !

2 searchers is fine ... it's "N" where N is not bound that you want to
avoid.

from what i understand of your requirements, you don't *really* need two
searchers open ... open a searcher to do whatever complex queries you need
to get the docIds to delete, then delete
them all, then close/reopen searcher (and check that the delets worked if
you don't trust it)

the only real reason you should really need 2 searchers at a time is if
you are searching other queries in parallel threads at the same time ...
or if you are warming up one new searcher that's "ondeck" while still
serving queries with an older searcher.



-Hoss


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: [Urgent] deleteDocuments fails after merging ...

Posted by DECAFFMEYER MATHIEU <MA...@fortis.lu>.
>> Is that the same reader that is used in IndexSearcher?

I opened an IndexSearcher on the path (String) to the index.
Now I tried to open on the clone IndexReader and use the constructor
that has an IndexReader as param, 
and I got everything working now ....

I just have two IndexSearchers opened now most of the time, which is
deprecated,
But I think that's my only choice !

Thank u !

__________________________________
   Matt

    

-----Original Message-----
From: Antony Bowesman [mailto:adb@teamware.com] 
Sent: Tuesday, March 13, 2007 10:13 PM
To: java-user@lucene.apache.org
Subject: Re: [Urgent] deleteDocuments fails after merging ...

*****  This message comes from the Internet Network *****

Erick Erickson wrote:
> The javadocs point out that this line
> 
> * int* nb = mIndexReaderClone.deleteDocuments(urlTerm)
> 
> removes*all* documents for a given term. So of course you'll fail
> to delete any documents the second time you call
> deleteDocuments with the same term.

Isn't the code snippet below doing a search before attempting the
deletion, so 
from the IndexReader's point of view (as used by the IndexSearcher) the
item 
exists.  What is mIndexReaderClone?  Is that the same reader that is
used in 
IndexSearcher?

I'm not sure, but if you search with one IndexReader and delete the
document 
using another IndexReader and then repeat the process, I think that the
search 
would still result in a hit, but the deletion would return 0.

> On 3/13/07, DECAFFMEYER MATHIEU <MA...@fortis.lu> wrote:
>>
>> Before I delete a document I search it in the index to be sure there
is a
>> hit (via a Term object),
>> When I find a hit I delete the document (with the same Term object),

>> Hits hits = search(query);
>> *if* (hits.length() > 0) {
>>        * if* (hits.length() > 1) {
>>                 System.out.println("found in the index with
duplicates");
>>         }
>>         System.out.println("found in the index");
>>        * try* {
>>                * int* nb =
mIndexReaderClone.deleteDocuments(urlTerm);
>>                * if* (nb > 0)
>>                         System.out.println("successfully deleted");
>>                * else*
>>                        * throw** new* IOException("0 doc deleted");
>>         }* catch* (IOException e) {
>>                 e.printStackTrace();
>>                * throw** new* Exception(
>>                         Thread.currentThread().getName() + " ---
Deleting

Antony


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org



============================================
Internet communications are not secure and therefore Fortis Banque Luxembourg S.A. does not accept legal responsibility for the contents of this message. The information contained in this e-mail is confidential and may be legally privileged. It is intended solely for the addressee. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. Nothing in the message is capable or intended to create any legally binding obligations on either party and it is not intended to provide legal advice.
============================================


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: [Urgent] deleteDocuments fails after merging ...

Posted by Antony Bowesman <ad...@teamware.com>.
Erick Erickson wrote:
> The javadocs point out that this line
> 
> * int* nb = mIndexReaderClone.deleteDocuments(urlTerm)
> 
> removes*all* documents for a given term. So of course you'll fail
> to delete any documents the second time you call
> deleteDocuments with the same term.

Isn't the code snippet below doing a search before attempting the deletion, so 
from the IndexReader's point of view (as used by the IndexSearcher) the item 
exists.  What is mIndexReaderClone?  Is that the same reader that is used in 
IndexSearcher?

I'm not sure, but if you search with one IndexReader and delete the document 
using another IndexReader and then repeat the process, I think that the search 
would still result in a hit, but the deletion would return 0.

> On 3/13/07, DECAFFMEYER MATHIEU <MA...@fortis.lu> wrote:
>>
>> Before I delete a document I search it in the index to be sure there is a
>> hit (via a Term object),
>> When I find a hit I delete the document (with the same Term object),

>> Hits hits = search(query);
>> *if* (hits.length() > 0) {
>>        * if* (hits.length() > 1) {
>>                 System.out.println("found in the index with duplicates");
>>         }
>>         System.out.println("found in the index");
>>        * try* {
>>                * int* nb = mIndexReaderClone.deleteDocuments(urlTerm);
>>                * if* (nb > 0)
>>                         System.out.println("successfully deleted");
>>                * else*
>>                        * throw** new* IOException("0 doc deleted");
>>         }* catch* (IOException e) {
>>                 e.printStackTrace();
>>                * throw** new* Exception(
>>                         Thread.currentThread().getName() + " --- Deleting

Antony


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: [Urgent] deleteDocuments fails after merging ...

Posted by DECAFFMEYER MATHIEU <MA...@fortis.lu>.
Thank u Erick,

I'll look more into docs to check why I get a search result and no
deletion ...

could have been less rude to me though ...
I feel a very mean person now :-( 

anyway thank u for your time

__________________________________

   Matt

    

-----Original Message-----
From: Erick Erickson [mailto:erickerickson@gmail.com] 
Sent: Tuesday, March 13, 2007 5:51 PM
To: java-user@lucene.apache.org
Subject: Re: [Urgent] deleteDocuments fails after merging ...

*****  This message comes from the Internet Network *****

Well, don't label things urgent. Since this forum is is free, you have
no right to demand a quick response.

You'd get better responses if there was some evidence that you
actually tried to find answers to your questions before posting
them. We all have other duties, and taking time out to answer
what look like questions that reflect minimum effort on your part
gets a very low priority.

Even a casual look at your code reveals a major problem, more
evidence that you aren't doing very much work before yelling
for help, thus going on the low-priority list.

The javadocs point out that this line

* int* nb = mIndexReaderClone.deleteDocuments(urlTerm)

removes*all* documents for a given term. So of course you'll fail
to delete any documents the second time you call
deleteDocuments with the same term.

Which is further evidence that your statement
"it never happens before I merge" shows that you haven't, for instance,
stepped through your code in a debugger trying to figure out what's
going on. If you had, you'd have noticed that the code always failed
on the second call. Which may have lead you to the answer for
yourself.

My overall point is that you seem to be treating this forum
as your personal help desk, with nothing better to do than
answer your questions as soon as they occur to you. This is
certainly the way many of your posts strike me.

Someone posted this URL a while back. While I think it's far
ruder than necessary, it does provide insights into asking for
help from free forums. I think you would get some valuable
information from it.

http://www.linuxforums.org/forum/linux-newbie/6322-asking-good-questions
-2-a.html

Erick




On 3/13/07, DECAFFMEYER MATHIEU <MA...@fortis.lu> wrote:
>
>
> Hi,
>
> I have put this question as "urgent" because I can notice I don't have
> often answers,
> If I'm asking the wrong way, please tell me...
>
> Before I delete a document I search it in the index to be sure there
is a
> hit (via a Term object),
> When I find a hit I delete the document (with the same Term object),
>
> But there is something very odd happing sometimes :
>
> I find a hit, but the integer returned by deleteDocuments is equal to
0
> This often happens after I merged the index for the first time,
> it never happens before I merge ...
>
>
> Term urlTerm =* **new* Term("url", urlToDel);
> Query query =* new* TermQuery(urlTerm);
>
> Hits hits = search(query);
> *if* (hits.length() > 0) {
>        * if* (hits.length() > 1) {
>                 System.out.println("found in the index with
duplicates");
>         }
>         System.out.println("found in the index");
>        * try* {
>                * int* nb = mIndexReaderClone.deleteDocuments(urlTerm);
>                * if* (nb > 0)
>                         System.out.println("successfully deleted");
>                * else*
>                        * throw** new* IOException("0 doc deleted");
>         }* catch* (IOException e) {
>                 e.printStackTrace();
>                * throw** new* Exception(
>                         Thread.currentThread().getName() + " ---
Deleting
> old entry failed",
>                         e);
>         }
>
> What I get sometimes is :
>
> found in the index
> 0 doc deleted
>
> *__________________________________*
>
> *   Matt*******
>
>
> ============================================
> Internet communications are not secure and therefore Fortis Banque
> Luxembourg S.A. does not accept legal responsibility for the contents
of
> this message. The information contained in this e-mail is confidential
and
> may be legally privileged. It is intended solely for the addressee. If
you
> are not the intended recipient, any disclosure, copying, distribution
or any
> action taken or omitted to be taken in reliance on it, is prohibited
and may
> be unlawful. Nothing in the message is capable or intended to create
any
> legally binding obligations on either party and it is not intended to
> provide legal advice.
> ============================================
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>


============================================
Internet communications are not secure and therefore Fortis Banque Luxembourg S.A. does not accept legal responsibility for the contents of this message. The information contained in this e-mail is confidential and may be legally privileged. It is intended solely for the addressee. If you are not the intended recipient, any disclosure, copying, distribution or any action taken or omitted to be taken in reliance on it, is prohibited and may be unlawful. Nothing in the message is capable or intended to create any legally binding obligations on either party and it is not intended to provide legal advice.
============================================


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: [Urgent] deleteDocuments fails after merging ...

Posted by Erick Erickson <er...@gmail.com>.
Well, don't label things urgent. Since this forum is is free, you have
no right to demand a quick response.

You'd get better responses if there was some evidence that you
actually tried to find answers to your questions before posting
them. We all have other duties, and taking time out to answer
what look like questions that reflect minimum effort on your part
gets a very low priority.

Even a casual look at your code reveals a major problem, more
evidence that you aren't doing very much work before yelling
for help, thus going on the low-priority list.

The javadocs point out that this line

* int* nb = mIndexReaderClone.deleteDocuments(urlTerm)

removes*all* documents for a given term. So of course you'll fail
to delete any documents the second time you call
deleteDocuments with the same term.

Which is further evidence that your statement
"it never happens before I merge" shows that you haven't, for instance,
stepped through your code in a debugger trying to figure out what's
going on. If you had, you'd have noticed that the code always failed
on the second call. Which may have lead you to the answer for
yourself.

My overall point is that you seem to be treating this forum
as your personal help desk, with nothing better to do than
answer your questions as soon as they occur to you. This is
certainly the way many of your posts strike me.

Someone posted this URL a while back. While I think it's far
ruder than necessary, it does provide insights into asking for
help from free forums. I think you would get some valuable
information from it.

http://www.linuxforums.org/forum/linux-newbie/6322-asking-good-questions-2-a.html

Erick




On 3/13/07, DECAFFMEYER MATHIEU <MA...@fortis.lu> wrote:
>
>
> Hi,
>
> I have put this question as "urgent" because I can notice I don't have
> often answers,
> If I'm asking the wrong way, please tell me...
>
> Before I delete a document I search it in the index to be sure there is a
> hit (via a Term object),
> When I find a hit I delete the document (with the same Term object),
>
> But there is something very odd happing sometimes :
>
> I find a hit, but the integer returned by deleteDocuments is equal to 0
> This often happens after I merged the index for the first time,
> it never happens before I merge ...
>
>
> Term urlTerm =* **new* Term("url", urlToDel);
> Query query =* new* TermQuery(urlTerm);
>
> Hits hits = search(query);
> *if* (hits.length() > 0) {
>        * if* (hits.length() > 1) {
>                 System.out.println("found in the index with duplicates");
>         }
>         System.out.println("found in the index");
>        * try* {
>                * int* nb = mIndexReaderClone.deleteDocuments(urlTerm);
>                * if* (nb > 0)
>                         System.out.println("successfully deleted");
>                * else*
>                        * throw** new* IOException("0 doc deleted");
>         }* catch* (IOException e) {
>                 e.printStackTrace();
>                * throw** new* Exception(
>                         Thread.currentThread().getName() + " --- Deleting
> old entry failed",
>                         e);
>         }
>
> What I get sometimes is :
>
> found in the index
> 0 doc deleted
>
> *__________________________________*
>
> *   Matt*******
>
>
> ============================================
> Internet communications are not secure and therefore Fortis Banque
> Luxembourg S.A. does not accept legal responsibility for the contents of
> this message. The information contained in this e-mail is confidential and
> may be legally privileged. It is intended solely for the addressee. If you
> are not the intended recipient, any disclosure, copying, distribution or any
> action taken or omitted to be taken in reliance on it, is prohibited and may
> be unlawful. Nothing in the message is capable or intended to create any
> legally binding obligations on either party and it is not intended to
> provide legal advice.
> ============================================
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org
>