Posted to solr-user@lucene.apache.org by Andrew Nagy <an...@villanova.edu> on 2006/12/05 23:45:38 UTC

Initial import problems

Hello, I am new to SOLR but very excited for its possibilities.

I am having some difficulties with my data import which I hope can be 
solved very easily.
First I wrote an xslt to transform my xml into the solr schema and 
modified the schema.xml to match the fields that I created.  I then ran 
the post.sh on my 492,000 records that I have.  Near the end of the 
process the records stopped being added due to a memory heap error.  I 
obviously maxed the allotted memory for the import.  Next time I will 
import less at a time!

I then posted a commit statement.  I went to my solr admin site and 
looked at the statistics.  It said 372,000 records (roughly) were stored 
and 1 commit.  I tried to do a search but no matter what I search for I 
get 0 results.  I even tried title:"the" (assuming it is not blocking 
the stop word, it should return something!).

It appears to me that the search is not searching any records.  Any idea 
as to what I might need to do, or should I start over from scratch and 
re-import my records in smaller chunks?

Thanks!
Andrew

Re: Initial import problems

Posted by Gmail Account <ma...@gmail.com>.
I'm having slow performance with my solr index. I'm not sure what to do. I 
need some suggestions on what to try. I have updated all my records in the 
last couple of days. I'm not sure how much it degraded because of that, but 
it now takes about 3 seconds per search. My cache statistics don't look so 
good either.

Also... I'm not sure I was supposed to do a couple of things.
    - I did an optimize index through Luke with compound format and noticed 
in the solrconfig file that useCompoundFile is set to false.
    - I changed one of the fields in the schema from text_ws to string
    - I added a field (type="text" indexed="false" stored="true")

My schema and solrconfig are the same as the example except I have a few 
more fields. My PC runs Windows XP and has 2 GB of RAM. Below are some stats from 
the solr admin stat page.

Thanks!


caching : true
numDocs : 1185814
maxDoc : 2070472
readerImpl : MultiReader

      name:  filterCache
      class:  org.apache.solr.search.LRUCache
      version:  1.0
      description:  LRU Cache(maxSize=512, initialSize=512, autowarmCount=256, regenerator=org.apache.solr.search.SolrIndexSearcher$1@d55986)
      stats:  lookups : 658446
      hits : 30
      hitratio : 0.00
      inserts : 658420
      evictions : 657908
      size : 512
      cumulative_lookups : 658446
      cumulative_hits : 30
      cumulative_hitratio : 0.00
      cumulative_inserts : 658420
      cumulative_evictions : 657908


      name:  queryResultCache
      class:  org.apache.solr.search.LRUCache
      version:  1.0
      description:  LRU Cache(maxSize=512, initialSize=512, autowarmCount=256, regenerator=org.apache.solr.search.SolrIndexSearcher$2@1b4c1d7)
      stats:  lookups : 88
      hits : 83
      hitratio : 0.94
      inserts : 6
      evictions : 0
      size : 5
      cumulative_lookups : 88
      cumulative_hits : 83
      cumulative_hitratio : 0.94
      cumulative_inserts : 6
      cumulative_evictions : 0


      name:  documentCache
      class:  org.apache.solr.search.LRUCache
      version:  1.0
      description:  LRU Cache(maxSize=512, initialSize=512)
      stats:  lookups : 780
      hits : 738
      hitratio : 0.94
      inserts : 42
      evictions : 0
      size : 42
      cumulative_lookups : 780
      cumulative_hits : 738
      cumulative_hitratio : 0.94
      cumulative_inserts : 42
      cumulative_evictions : 0
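
For readers skimming the numbers above, here is a small sketch that recomputes the ratios from the posted figures. It is only arithmetic on the values in this message; the helper name hit_ratio and the truncation to two decimals are assumptions chosen because they reproduce the displayed values, not a statement of how Solr formats them. The gap between maxDoc and numDocs is the number of deleted documents still present in the index, which is consistent with every record having been updated recently.

```python
def hit_ratio(hits, lookups):
    """Hit ratio truncated to two decimals; 0.0 when there were no lookups."""
    if lookups == 0:
        return 0.0
    return (hits * 100 // lookups) / 100


# Figures copied from the admin statistics page quoted above.
caches = {
    "filterCache": {"lookups": 658446, "hits": 30, "evictions": 657908},
    "queryResultCache": {"lookups": 88, "hits": 83, "evictions": 0},
    "documentCache": {"lookups": 780, "hits": 738, "evictions": 0},
}
num_docs, max_doc = 1185814, 2070472

# Deleted documents that still occupy space in the index.
deleted = max_doc - num_docs
deleted_fraction = deleted / max_doc

for name, stats in caches.items():
    ratio = hit_ratio(stats["hits"], stats["lookups"])
    print(f"{name}: hitratio={ratio:.2f} evictions={stats['evictions']}")
print(f"deleted docs: {deleted} ({deleted_fraction:.1%} of maxDoc)")
```

The filterCache stands out: nearly every insert is evicted again, which suggests that 512 entries is far too small for this workload, and roughly 43% of maxDoc is deleted documents.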

 


Re: Initial import problems

Posted by Mike Klaas <mi...@gmail.com>.
On 12/5/06, Andrew Nagy <an...@villanova.edu> wrote:
> Hello, I am new to SOLR but very excited for its possibilities.
>
> I am having some difficulties with my data import which I hope can be
> solved very easily.
> First I wrote an xslt to transform my xml into the solr schema and
> modified the schema.xml to match the fields that I created.  I then ran
> the post.sh on my 492,000 records that I have.  Near the end of the
> process the records stopped being added due to a memory heap error.  I
> obviously maxed the allotted memory for the import.  Next time I will
> import less at a time!

Yeah, committing more frequently should help this case.
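
A minimal sketch of what importing in smaller batches with a commit after each one could look like. The update URL, the batch size of 1000, and the helper names (batches, add_message, post, import_in_batches) are assumptions for illustration; records are assumed to be dicts of field name to value that already match schema.xml. It only uses the XML <add> and <commit/> update messages.

```python
from itertools import islice
from xml.sax.saxutils import escape, quoteattr
import urllib.request

# Assumed location of the update handler; adjust to your installation.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def batches(records, size):
    """Yield lists of at most `size` records from any iterable."""
    iterator = iter(records)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def add_message(chunk):
    """Build one <add> message holding every record in the chunk."""
    docs = []
    for record in chunk:
        fields = "".join(
            "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
            for name, value in record.items()
        )
        docs.append("<doc>%s</doc>" % fields)
    return "<add>%s</add>" % "".join(docs)


def post(xml, url=SOLR_UPDATE_URL):
    """POST one XML update message and return the raw response body."""
    request = urllib.request.Request(
        url,
        data=xml.encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def import_in_batches(records, size=1000, send=post):
    """Send records in batches, committing after each; return the count sent."""
    sent = 0
    for chunk in batches(records, size):
        send(add_message(chunk))
        send("<commit/>")
        sent += len(chunk)
    return sent
```

Passing a different send function (for example, one that appends to a list) makes it possible to inspect the generated messages without a running server.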

> I then posted a commit statement.  I went to my solr admin site and
> looked at the statistics.  It said 372,000 records (roughly) were stored
> and 1 commit.  I tried to do a search but no matter what I search for I
> get 0 results.  I even tried title:"the" (assuming it is not blocking
> the stop word, it should return something!).

The schema in the example does include a stop word filter--are you
sure that you aren't blocking stop words?
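
To illustrate the point, a toy sketch of a stop word filter (not Solr's actual analyzer; the word list and the analyze function are made up for this example). If the only term in a query is a stop word, nothing is left to search for after analysis.

```python
# Illustrative stop word list; the example's stopwords.txt will differ.
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def analyze(text, stop_words=STOP_WORDS):
    """Lowercase, split on whitespace, and drop stop words."""
    return [token for token in text.lower().split() if token not in stop_words]


print(analyze("the"))                  # []
print(analyze("The History of Rome"))  # ['history', 'rome']
```

A less ambiguous test is a word that is certain to appear in the data and is not a stop word.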

-Mike

Re: Initial import problems

Posted by Yonik Seeley <yo...@apache.org>.
On 12/5/06, Andrew Nagy <an...@villanova.edu> wrote:
> Hello, I am new to SOLR but very excited for its possibilities.
>
> I am having some difficulties with my data import which I hope can be
> solved very easily.
> First I wrote an xslt to transform my xml into the solr schema and
> modified the schema.xml to match the fields that I created.  I then ran
> the post.sh on my 492,000 records that I have.  Near the end of the
> process the records stopped being added due to a memory heap error.  I
> obviously maxed the allotted memory for the import.  Next time I will
> import less at a time!

Did you increase the JVM heap size?

> I then posted a commit statement.

Correct operation of the server after an OOM exception isn't really
guaranteed (the exception may happen in any thread, in any library,
including that of the app server).

>  I went to my solr admin site and
> looked at the statistics.  It said 372,000 records (roughly) were stored
> and 1 commit.  I tried to do a search but no matter what I search for I
> get 0 results.  I even tried title:"the" (assuming it is not blocking
> the stop word, it should return something!).
>
> It appears to me that the search is not searching any records.  Any idea
> as to what I might need to do, or should I start over from scratch and
> re-import my records in smaller chunks?

That might help, but may not be sufficient if you don't have enough heap memory.

-Yonik