Posted to solr-user@lucene.apache.org by Armin Schleicher <sc...@gmail.com> on 2011/11/15 09:11:20 UTC

creating solr index from nutch segments, no errors, no results

Hi there,

I am trying to create a full-text index over Internet Archive .warc 
files. The whole procedure (described below) seems to work fine and I 
do not get any errors or warnings, but no data is passed to Solr; at 
least q=*:* returns nothing. I double-checked that the Nutch schema.xml 
is in the right place, and when I dump the segments into a text file, 
all the data is there.
I create the segments from *.warc.gz files produced by Archive-It 
(Heritrix) using the NutchWAX import command, and then create the 
crawldb and linkdb using the nutch updatedb and invertlinks commands.
Here is my procedure:

*create solrindex*

    sh /nutch-1.3/runtime/local/bin/nutch solrindex http://127.0.0.1:8983/solr/ /crawldb /linkdb /segments_test/


*nutch output:*

    SolrIndexer: starting at 2011-11-15 08:45:53
    SolrIndexer: finished at 2011-11-15 08:45:57, elapsed: 00:00:03

*this is the resulting solr/jetty output:*

    15.11.2011 08:45:57 org.apache.solr.update.DirectUpdateHandler2 commit
    INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true,expungeDeletes=false)
    15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher <init>
    INFO: Opening Searcher@3d015a9e main
    15.11.2011 08:45:57 org.apache.solr.update.DirectUpdateHandler2 commit
    INFO: end_commit_flush
    15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
    INFO: autowarming Searcher@3d015a9e main from Searcher@4743bf3d main
        
    fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
    15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
    INFO: autowarming result for Searcher@3d015a9e main
        
    fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
    15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
    INFO: autowarming Searcher@3d015a9e main from Searcher@4743bf3d main
        
    filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
    15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
    INFO: autowarming result for Searcher@3d015a9e main
        
    filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
    15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
    INFO: autowarming Searcher@3d015a9e main from Searcher@4743bf3d main
        
    queryResultCache{lookups=1,hits=0,hitratio=0.00,inserts=2,evictions=0,size=2,warmupTime=0,cumulative_lookups=1,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=1,cumulative_evictions=0}
    15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
    INFO: autowarming result for Searcher@3d015a9e main
        
    queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=1,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=1,cumulative_evictions=0}
    15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
    INFO: autowarming Searcher@3d015a9e main from Searcher@4743bf3d main
        
    documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
    15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher warm
    INFO: autowarming result for Searcher@3d015a9e main
        
    documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
    15.11.2011 08:45:57 org.apache.solr.core.QuerySenderListener newSearcher
    INFO: QuerySenderListener sending requests to Searcher@3d015a9e main
    15.11.2011 08:45:57 org.apache.solr.core.QuerySenderListener newSearcher
    INFO: QuerySenderListener done.
    15.11.2011 08:45:57 org.apache.solr.core.SolrCore registerSearcher
    INFO: [] Registered new searcher Searcher@3d015a9e main
    15.11.2011 08:45:57 org.apache.solr.search.SolrIndexSearcher close
    INFO: Closing Searcher@4743bf3d main
        
    fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
        
    filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
        
    queryResultCache{lookups=1,hits=0,hitratio=0.00,inserts=2,evictions=0,size=2,warmupTime=0,cumulative_lookups=1,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=1,cumulative_evictions=0}
        
    documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
    15.11.2011 08:45:57 org.apache.solr.update.processor.LogUpdateProcessor finish
    INFO: {commit=} 0 48
    15.11.2011 08:45:57 org.apache.solr.core.SolrCore execute
    INFO: [] webapp=/solr path=/update params={waitSearcher=true&waitFlush=true&wt=javabin&commit=true&version=2} status=0 QTime=48

Sorry for double-posting this on both the Nutch and the Solr mailing 
lists, but I don't really know which of the two is causing the problem.
Any hints will be highly appreciated!
Best, Armin

Re: creating solr index from nutch segments, no errors, no results

Posted by Michael Kuhlmann <ku...@solarier.de>.
I don't know much about nutch, but it looks like there's simply a commit 
missing at the end.

Try sending a commit, e.g. by executing

curl http://host:port/solr/<core>/update -H "Content-Type: text/xml" \
    --data-binary '<commit />'
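
To check whether an explicit commit changes anything, the curl call above can be wrapped together with a document count. This is only a rough sketch: SOLR_BASE is assumed from the solrindex command earlier in this thread (a single-core setup, so no <core> path segment), the function names are made up for illustration, and the parsing assumes Solr answers the query in its XML response format.

```shell
#!/bin/sh
# Sketch: send an explicit commit to Solr, then count the indexed documents.
# SOLR_BASE is an assumption taken from the solrindex call in this thread
# (single-core setup); override it in the environment if yours differs.
SOLR_BASE="${SOLR_BASE:-http://127.0.0.1:8983/solr}"

# Read a Solr XML response on stdin and print the first numFound value.
# Hypothetical helper, not part of Solr or Nutch.
num_found() {
    sed -n 's/.*numFound="\([0-9][0-9]*\)".*/\1/p' | head -n 1
}

# Post an explicit <commit/> to the update handler.
send_commit() {
    curl -s "$SOLR_BASE/update" \
         -H "Content-Type: text/xml" \
         --data-binary '<commit/>'
}

# Match all documents but request zero rows; only numFound matters here.
count_docs() {
    curl -s -G "$SOLR_BASE/select" \
         --data-urlencode 'q=*:*' \
         --data-urlencode 'rows=0' | num_found
}
```

After a solrindex run, `send_commit && count_docs` should print Solr's response to the commit followed by the number of documents in the index. If the count is still 0 after an explicit commit, that would suggest looking at what the indexer actually sent rather than at the commit.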

-Kuli

On 15.11.2011 09:11, Armin Schleicher wrote:
> hi there,
[...]