Posted to solr-user@lucene.apache.org by Marcus Stratmann <st...@gmx.de> on 2006/05/01 11:37:28 UTC

Re: Java heap space

Yonik Seeley wrote:
>> Yes, on a delete operation. I'm not doing any commits until the end of
>> all delete operations.
> I assume this is a delete-by-id and not a delete-by-query?  They work
> very differently.

Yes, all queries are delete-by-id.
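
For reference, the two delete forms look roughly like this in Solr's XML update format. This is only a minimal Python sketch that builds the message strings; the helper names delete_by_id and delete_by_query are made up for illustration, and actually posting the messages to the update handler is left out.

```python
from xml.sax.saxutils import escape


def delete_by_id(doc_id):
    # Hypothetical helper: removes the single document whose uniqueKey equals doc_id.
    return "<delete><id>%s</id></delete>" % escape(str(doc_id))


def delete_by_query(query):
    # Hypothetical helper: removes every document matching the given query string.
    return "<delete><query>%s</query></delete>" % escape(query)


if __name__ == "__main__":
    print(delete_by_id(111))
    print(delete_by_query("id:111"))
```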

> If you are first deleting so you can re-add a newer version of the
> document, you don't need to... overwriting older documents based on
> the uniqueKeyField is something Solr does for you!

Yes, I know. But the articles in our (SQL) database get new IDs when
they are changed, so they need to be deleted and re-inserted into the index.

> Is it possible to use a profiler to see where all the memory is going?
> It sounds like you may have uncovered a memory leak somewhere.

I'm not very experienced with Java, but if you give me some advice I'll
be glad to help. So far I have only had a quick look at JMP once.
Don't hesitate to write me a private message on that subject.

> Also what OS, what JVM, what appserver are you using?
OS: Linux (Debian GNU/Linux i686)
JVM: Java HotSpot(TM) Server VM (build 1.5.0_06-b05, mixed mode) of 
Sun's JDK 5.0.
Currently I'm using the Jetty installation from the Solr nightly builds
for test purposes.

Marcus

Re: Java heap space

Posted by Marcus Stratmann <st...@gmx.de>.

Chris Hostetter wrote:
> this is off the subject of the heap space issue ... but if the id changes,
> then maybe it shouldn't be the uniqueId of your index? .. your code must
> have some way of recognizing that article B with id 222 is a changed
> version of article A with id 111 (otherwise how would you know to delete
> 111 when you insert 222?) ..whatever that mechanism is, perhaps it should
> determine your uniqueKey?

No, there is no "key" or anything else that reveals a relation between new
article B and old article A. After B is inserted and A is deleted, every
trace of A is gone and we do not even know that B is A's
"successor". Changes are simply kept in a table which tells the system
which IDs to delete and which new (or changed) articles to insert,
automatically giving them new IDs. I know this may not be (or at least
sound) perfect and it is not the way things are handled normally. But
this works fine for our needs. We gather information about changes to
our data during the day and apply them on a nightly update (which, I
know, does not imply that IDs have to change).
So, yes, I'm sure I got the right uniqueKey. ;-)
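
The nightly update described above could be sketched roughly as follows. This is only an illustration, not our actual code: the function name build_nightly_update and the field names "id" and "title" are assumptions, and the sketch only builds the XML update messages (deletes by id first, then adds, with a single commit at the end) without sending them anywhere.

```python
from xml.sax.saxutils import escape, quoteattr


def build_nightly_update(ids_to_delete, new_articles):
    """Return the list of XML update messages for one nightly run.

    ids_to_delete: IDs of outdated articles (taken from the change table).
    new_articles:  dicts mapping field name -> value, already carrying new IDs.
    """
    messages = []

    # One delete-by-id message per outdated article.
    for doc_id in ids_to_delete:
        messages.append("<delete><id>%s</id></delete>" % escape(str(doc_id)))

    # One add message per new or changed article.
    for article in new_articles:
        fields = "".join(
            "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
            for name, value in article.items()
        )
        messages.append("<add><doc>%s</doc></add>" % fields)

    # Commit only once, after all deletes and adds.
    messages.append("<commit/>")
    return messages


if __name__ == "__main__":
    for message in build_nightly_update([111], [{"id": 222, "title": "B"}]):
        print(message)
```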

Marcus

Re: Java heap space

Posted by Chris Hostetter <ho...@fucit.org>.

: > If you are first deleting so you can re-add a newer version of the
: > document, you don't need to... overwriting older documents based on
: > the uniqueKeyField is something Solr does for you!
:
: Yes, I know. But the articles in our (SQL) database get new IDs when
: they are changed, so they need to be deleted and re-inserted into the index.

this is off the subject of the heap space issue ... but if the id changes,
then maybe it shouldn't be the uniqueId of your index? .. your code must
have some way of recognizing that article B with id 222 is a changed
version of article A with id 111 (otherwise how would you know to delete
111 when you insert 222?) ..whatever that mechanism is, perhaps it should
determine your uniqueKey?


-Hoss