Posted to solr-user@lucene.apache.org by Jonatan Fournier <jo...@gmail.com> on 2012/08/13 17:11:55 UTC

Index not loading

Hi,

I'm using Solr 4.0.0-ALPHA and the EmbeddedSolrServer.

Within my SolrJ application, the documents are added to the server
using the commitWithin parameter (in my case 60s). After 1 day, all 125
million of my documents have been added to the server and I can see 89G of
index data files. I stop my SolrJ application and reload my Solr
instance in Tomcat.

From the Solr admin panel related to my Core (collection1) I see this info:


Last Modified:
Num Docs:0
Max Doc:0
Version:1
Segment Count:0
Optimized: (green check)
Current:  (green check)
Master:	
Version: 0
Gen: 1
Size: 88.14 GB


From the general Core Admin panel I see:

lastModified:
version:1
numDocs:0
maxDoc:0
optimized: (red circle)
current: (green check)
hasDeletions: (red circle)

If I query my index for *:* I get 0 results. If I trigger an optimize, it
wipes ALL the data inside the index and resets it to empty. Initially I
experimented with my EmbeddedSolrServer using autoCommit/softCommit and it
was working fine. Now that I've switched to passing commitWithin on the
document add request, this happens every time! I'm never able to reload my
index within Tomcat/Solr.

Any idea?

Cheers,

/jonathan

Re: Index not loading

Posted by Jonatan Fournier <jo...@gmail.com>.
On Tue, Aug 14, 2012 at 5:37 PM, Jonatan Fournier
<jo...@gmail.com> wrote:
> On Tue, Aug 14, 2012 at 10:25 AM, Erick Erickson
> <er...@gmail.com> wrote:
>> This is quite odd, it really sounds like you're not
>> actually committing. So, some questions.
>>
>> 1> What happens if you search before you shut
>> down your tomcat? Do you see docs then? If so,
>> somehow you're doing soft commits and never
>> doing a hard commit.

Yeah, I just realized the behavior is the same as with softCommit. Is that
the default for commitWithin?

Cheers,

/jonathan


Re: Index not loading

Posted by Jonatan Fournier <jo...@gmail.com>.
On Tue, Aug 14, 2012 at 10:25 AM, Erick Erickson
<er...@gmail.com> wrote:
> This is quite odd, it really sounds like you're not
> actually committing. So, some questions.
>
> 1> What happens if you search before you shut
> down your tomcat? Do you see docs then? If so,
> somehow you're doing soft commits and never
> doing a hard commit.
>
> 2> What happens if, as the last statement in your SolrJ
> program you do a commit()?

When using commitWithin, if I call server.commit() during the
data load, the data gets committed (I didn't reproduce this with my
89G of data...). If I shut down my EmbeddedSolrServer, restart it and
send a commit, like on Tomcat, all the data gets wiped out too. So I guess
there's state loss somewhere.

Cheers,

/jonathan


Re: Index not loading

Posted by Jonatan Fournier <jo...@gmail.com>.
That's the commit log I'm getting when using commitWithin:

Aug 14, 2012 12:53:52 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit{flags=0,version=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true}
Aug 14, 2012 12:53:52 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@1ec3756d main
Aug 14, 2012 12:53:52 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Aug 14, 2012 12:53:52 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to Searcher@1ec3756d
main{StandardDirectoryReader(segments_1:57:nrt _0(4.0):C12847
_5(4.0):C49855 _6(4.0):C12195 _d(4.0):C47716 _8(4.0):C4340
_p(4.0):C48589 _i(4.0):C46026 _j(4.0):C2599 _n(4.0):C12012
_o(4.0):C11943 _q(4.0):C11783 _r(4.0):C2610)}
Aug 14, 2012 12:53:52 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
Aug 14, 2012 12:53:52 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [collection1] Registered new searcher Searcher@1ec3756d
main{StandardDirectoryReader(segments_1:57:nrt _0(4.0):C12847
_5(4.0):C49855 _6(4.0):C12195 _d(4.0):C47716 _8(4.0):C4340
_p(4.0):C48589 _i(4.0):C46026 _j(4.0):C2599 _n(4.0):C12012
_o(4.0):C11943 _q(4.0):C11783 _r(4.0):C2610)}

When setting autoCommit in solrconfig.xml I get a more verbose output:

Aug 14, 2012 1:02:59 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit{flags=0,version=0,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false}
Aug 14, 2012 1:02:59 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: start commit{flags=0,version=0,optimize=false,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true}
Aug 14, 2012 1:02:59 PM org.apache.solr.search.SolrIndexSearcher <init>
INFO: Opening Searcher@b224fa3 main
Aug 14, 2012 1:02:59 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush
Aug 14, 2012 1:02:59 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to Searcher@b224fa3
main{StandardDirectoryReader(segments_4:67:nrt _0(4.0):C12847
_5(4.0):C49855 _6(4.0):C12195 _p(4.0):C48396 _8(4.0):C7785
_9(4.0):C777 _a(4.0):C3445 _j(4.0):C47648 _v(4.0):C46567 _k(4.0):C200
_l(4.0):C964 _m(4.0):C2958 _r(4.0):C10221 _s(4.0):C1674 _t(4.0):C632
_u(4.0):C12399 _w(4.0):C3952)}
Aug 14, 2012 1:02:59 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener done.
Aug 14, 2012 1:02:59 PM org.apache.solr.core.SolrCore registerSearcher
INFO: [collection1] Registered new searcher Searcher@b224fa3
main{StandardDirectoryReader(segments_4:67:nrt _0(4.0):C12847
_5(4.0):C49855 _6(4.0):C12195 _p(4.0):C48396 _8(4.0):C7785
_9(4.0):C777 _a(4.0):C3445 _j(4.0):C47648 _v(4.0):C46567 _k(4.0):C200
_l(4.0):C964 _m(4.0):C2958 _r(4.0):C10221 _s(4.0):C1674 _t(4.0):C632
_u(4.0):C12399 _w(4.0):C3952)}
Aug 14, 2012 1:02:59 PM org.apache.solr.core.SolrDeletionPolicy onCommit
INFO: SolrDeletionPolicy.onCommit: commits:num=2
	commit{dir=/mnt/data/solr/couids/data/index,segFN=segments_4,generation=4,filenames=[_5_nrm.cfe,
_s.fdt, _l.si, _k.fnm, _0_Lucene40_0.prx, _r_nrm.cfs, _m.si, _8.si,
_8_Lucene40_0.frq, _q.si, _h.fdt, _a_Lucene40_0.frq, _h.fdx,
_h_Lucene40_0.frq, _r_nrm.cfe, _g.fdt, _0_Lucene40_0.tim, _g.fdx,
_s.si, _9.fdt, _9.fdx, _h_Lucene40_0.tim, _9_Lucene40_0.frq,
_0_Lucene40_0.tip, _h_Lucene40_0.tip, _5_nrm.cfs, _l.fdx,
_l_Lucene40_0.prx, _l.fdt, _i_Lucene40_0.frq, _6.fdt, _a.fnm,
_k_Lucene40_0.tim, _j.fdx, _m_Lucene40_0.frq, _k_Lucene40_0.tip,
_r_Lucene40_0.frq, _j.fdt, _6.fdx, _a_Lucene40_0.tim,
_a_Lucene40_0.tip, _m_Lucene40_0.tip, _m_Lucene40_0.tim, _i.si,
_k_Lucene40_0.frq, _i_nrm.cfe, _i_nrm.cfs, _r.fdt, _r.fnm,
_5_Lucene40_0.tim, _0_Lucene40_0.frq, _5_Lucene40_0.tip,
_g_Lucene40_0.tim, _r.fdx, _r_Lucene40_0.tim, _g_Lucene40_0.tip,
_i.fdx, _r_Lucene40_0.tip, _i.fdt, _m_Lucene40_0.prx, _j.si,
_g_nrm.cfs, _9.fnm, _q_Lucene40_0.frq, _p.si, _g_Lucene40_0.frq,
_j_Lucene40_0.tip, _g_Lucene40_0.prx, _p_Lucene40_0.prx,
_j_Lucene40_0.tim, _s_Lucene40_0.prx, _m.fdt, _g_nrm.cfe, _m.fdx,
_6.si, _6.fnm, _5_Lucene40_0.prx, _8_nrm.cfe, _8_Lucene40_0.tim,
_p.fdx, _5.fdt, _l_nrm.cfe, _6_Lucene40_0.tim, _p.fdt,
_6_Lucene40_0.tip, _q.fdx, _s_Lucene40_0.frq, _i_Lucene40_0.tim,
_q.fdt, _l_Lucene40_0.tim, _l_nrm.cfs, _q_Lucene40_0.tim,
_i_Lucene40_0.tip, _l_Lucene40_0.tip, _h_Lucene40_0.prx, _h.si,
_k_Lucene40_0.prx, _9_nrm.cfs, _9_Lucene40_0.tip, _9.si,
_j_Lucene40_0.frq, _m.fnm, _k.si, _q_Lucene40_0.tip, _s_nrm.cfe,
_m_nrm.cfe, _p_Lucene40_0.frq, _k_nrm.cfe, _5_Lucene40_0.frq,
_a_nrm.cfe, _h.fnm, _0.fnm, _j_nrm.cfe, _a_Lucene40_0.prx, _q_nrm.cfe,
_9_nrm.cfe, _8_Lucene40_0.prx, _s_Lucene40_0.tip, _i_Lucene40_0.prx,
_s_Lucene40_0.tim, _q_nrm.cfs, _a.si, _a_nrm.cfs, _r_Lucene40_0.prx,
_s_nrm.cfs, _6_Lucene40_0.frq, _p_nrm.cfe, _8_nrm.cfs, _5.si,
_k_nrm.cfs, _8.fnm, _m_nrm.cfs, _6_Lucene40_0.prx, _r.si, _q.fnm,
_p_nrm.cfs, _8_Lucene40_0.tip, _j_nrm.cfs, _q_Lucene40_0.prx, _g.si,
_l.fnm, _p.fnm, _k.fdt, _k.fdx, _h_nrm.cfe, _s.fnm, _a.fdt,
_9_Lucene40_0.prx, _a.fdx, _l_Lucene40_0.frq, _g.fnm, _6_nrm.cfs,
_p_Lucene40_0.tim, _h_nrm.cfs, _p_Lucene40_0.tip, _0.si, _5.fnm,
_9_Lucene40_0.tim, _j_Lucene40_0.prx, _6_nrm.cfe, _0_nrm.cfs, _s.fdx,
_j.fnm, _0_nrm.cfe, _5.fdx, _0.fdx, _8.fdx, _i.fnm, _0.fdt,
segments_4, _8.fdt]
	commit{dir=/mnt/data/solr/couids/data/index,segFN=segments_5,generation=5,filenames=[_5_nrm.cfe,
_v.fdx, _s.fdt, _l.si, _w_Lucene40_0.prx, _k.fnm, _0_Lucene40_0.prx,
_r_nrm.cfs, _m.si, _8.si, _8_Lucene40_0.frq, _a_Lucene40_0.frq,
_v.fnm, _w.fnm, _r_nrm.cfe, _0_Lucene40_0.tim, _w.fdt, _s.si, _w.fdx,
_t_Lucene40_0.tim, _9.fdt, _t_Lucene40_0.tip, _9.fdx,
_u_Lucene40_0.frq, _9_Lucene40_0.frq, _0_Lucene40_0.tip, _5_nrm.cfs,
_l.fdx, _l_Lucene40_0.prx, _l.fdt, _6.fdt, _t.fdt, _a.fnm, _j.fdx,
_k_Lucene40_0.tim, _w.si, _m_Lucene40_0.frq, _k_Lucene40_0.tip,
_r_Lucene40_0.frq, _j.fdt, _6.fdx, _a_Lucene40_0.tim, _u.fdx,
_t_Lucene40_0.prx, _a_Lucene40_0.tip, _v_Lucene40_0.frq,
_m_Lucene40_0.tip, _m_Lucene40_0.tim, _k_Lucene40_0.frq, _r.fdt,
_r.fnm, _u.fnm, _5_Lucene40_0.tim, _0_Lucene40_0.frq,
_5_Lucene40_0.tip, _r.fdx, _r_Lucene40_0.tim, _r_Lucene40_0.tip,
_m_Lucene40_0.prx, _j.si, _v.si, _9.fnm, _p.si, _j_Lucene40_0.tip,
_v_Lucene40_0.prx, _p_Lucene40_0.prx, _j_Lucene40_0.tim,
_v_Lucene40_0.tip, _s_Lucene40_0.prx, _m.fdt, _v_Lucene40_0.tim,
_m.fdx, _6.si, _6.fnm, _5_Lucene40_0.prx, _8_nrm.cfe,
_8_Lucene40_0.tim, _p.fdx, _5.fdt, _l_nrm.cfe, _6_Lucene40_0.tim,
_p.fdt, _6_Lucene40_0.tip, _u_Lucene40_0.tip, _t_Lucene40_0.frq,
_s_Lucene40_0.frq, _u_Lucene40_0.tim, _l_Lucene40_0.tim, _l_nrm.cfs,
_l_Lucene40_0.tip, _9_nrm.cfs, _k_Lucene40_0.prx, _9_Lucene40_0.tip,
_9.si, _j_Lucene40_0.frq, _m.fnm, _k.si, _s_nrm.cfe, _m_nrm.cfe,
_p_Lucene40_0.frq, _5_Lucene40_0.frq, _a_nrm.cfe, _k_nrm.cfe, _0.fnm,
_j_nrm.cfe, _a_Lucene40_0.prx, _9_nrm.cfe, _8_Lucene40_0.prx,
_s_Lucene40_0.tip, _s_Lucene40_0.tim, _a.si, _a_nrm.cfs,
_r_Lucene40_0.prx, _s_nrm.cfs, _6_Lucene40_0.frq, _p_nrm.cfe,
_8_nrm.cfs, _5.si, _k_nrm.cfs, _8.fnm, _m_nrm.cfs, _u.si, _u.fdt,
_6_Lucene40_0.prx, _r.si, _p_nrm.cfs, _8_Lucene40_0.tip, _j_nrm.cfs,
_l.fnm, _t.fnm, _p.fnm, _k.fdt, _w_Lucene40_0.tip, _k.fdx, _s.fnm,
_a.fdt, _w_Lucene40_0.tim, _t_nrm.cfs, _9_Lucene40_0.prx, _v_nrm.cfs,
_a.fdx, _l_Lucene40_0.frq, _t.si, _6_nrm.cfs, _u_nrm.cfs,
_p_Lucene40_0.tim, _p_Lucene40_0.tip, _w_nrm.cfe, _0.si,
_w_Lucene40_0.frq, _u_Lucene40_0.prx, _5.fnm, _9_Lucene40_0.tim,
_j_Lucene40_0.prx, _v.fdt, _u_nrm.cfe, _6_nrm.cfe, _w_nrm.cfs,
_0_nrm.cfs, _s.fdx, _j.fnm, _0_nrm.cfe, _t.fdx, _5.fdx, _v_nrm.cfe,
_t_nrm.cfe, _0.fdx, _8.fdx, segments_5, _0.fdt, _8.fdt]
Aug 14, 2012 1:02:59 PM org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO: newest commit = 5
Aug 14, 2012 1:02:59 PM org.apache.solr.update.DirectUpdateHandler2 commit
INFO: end_commit_flush

I don't think my config is wrong, since the dummy commitWithin
JSON update works and my autoCommit always works... What
else could be wrong other than the SolrServer in SolrJ?
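
Comparing the two logs above, the commitWithin run only reports a commit with softCommit=true, while the autoCommit run also reports one with softCommit=false. A minimal Java sketch for checking that flag in captured log lines could look like the following. The class and method names (CommitLogLine, parseFlags, isHardCommit) are made up for illustration and are not part of Solr, and treating softCommit=false as a "hard" commit is an assumption based only on the flag names that appear in these logs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CommitLogLine {

    /** Parses "INFO: start commit{k=v,k=v,...}" into an ordered key/value map. */
    public static Map<String, String> parseFlags(String line) {
        int open = line.indexOf('{');
        int close = line.lastIndexOf('}');
        if (open < 0 || close <= open) {
            throw new IllegalArgumentException("not a commit line: " + line);
        }
        Map<String, String> flags = new LinkedHashMap<>();
        for (String pair : line.substring(open + 1, close).split(",")) {
            String[] kv = pair.split("=", 2);
            if (kv.length == 2) {
                flags.put(kv[0].trim(), kv[1].trim());
            }
        }
        return flags;
    }

    /** Assumption: a commit is "hard" when the log reports softCommit=false. */
    public static boolean isHardCommit(String line) {
        return "false".equals(parseFlags(line).get("softCommit"));
    }

    public static void main(String[] args) {
        // Both lines are copied from the logs in this message.
        String viaCommitWithin = "INFO: start commit{flags=0,version=0,optimize=false,"
                + "openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=true}";
        String viaAutoCommit = "INFO: start commit{flags=0,version=0,optimize=false,"
                + "openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false}";
        System.out.println("commitWithin -> hard commit? " + isHardCommit(viaCommitWithin)); // false
        System.out.println("autoCommit   -> hard commit? " + isHardCommit(viaAutoCommit));   // true
    }
}
```

Running this over a whole log file and counting the lines for which isHardCommit returns true would be one way to see whether any hard commit happened during a load.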

Cheers,

/jonathan

On Tue, Aug 14, 2012 at 12:30 PM, Jonatan Fournier
<jo...@gmail.com> wrote:
> On Tue, Aug 14, 2012 at 11:14 AM, Jack Krupansky
> <ja...@basetechnology.com> wrote:
>> If you send a dummy document using a curl command, without the commit
>> option, does it auto-commit and become visible in 1 minute?
>
> Sending a JSON document using curl:
>
> {
>   "add": {
>     "commitWithin": 60000,
>     "overwrite": false,
>     "doc": {
>       "id" : "1",
>       "type" : "foo"
>     }
>   }
> }
>
> This worked fine. But if I use the EmbeddedSolrServer's add(doc, commitWithin),
> the document doesn't show up in the search results.
>
> From this article:
> http://www.cominvent.com/2011/09/09/discover-commitwithin-in-solr/
>
> I see there are multiple ways to specify the commitWithin option.
>
> https://issues.apache.org/jira/browse/SOLR-2742 introduced it to the
> .add() methods for SolrServer. Could it be broken only there?
>
> I will go try this syntax:
>
>     UpdateRequest req = new UpdateRequest();
>     req.add(mySolrInputDocument);
>     req.setCommitWithin(10000);
>     req.process(server);
>
> Cheers,
>
> /jonathan
>

Re: Index not loading

Posted by Jonatan Fournier <jo...@gmail.com>.
On Tue, Aug 14, 2012 at 11:14 AM, Jack Krupansky
<ja...@basetechnology.com> wrote:
> If you send a dummy document using a curl command, without the commit
> option, does it auto-commit and become visible in 1 minute?

Sending a JSON document using curl:

{
  "add": {
    "commitWithin": 60000,
    "overwrite": false,
    "doc": {
      "id" : "1",
      "type" : "foo"
    }
  }
}

This worked fine. But if I use the EmbeddedSolrServer's add(doc, commitWithin),
the document doesn't show up in the search results.
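
For reference, the JSON add command shown above can also be assembled in code before it is handed to curl or any HTTP client. This is only a rough sketch: the AddCommand class and its build and quote methods are hypothetical names, the field names are the ones from the sample document, and the escaping handles only backslashes and double quotes, so it is not a general JSON encoder.

```java
public class AddCommand {

    /** Wraps a value in double quotes, escaping backslashes and double quotes only. */
    static String quote(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Builds the same "add" command as the curl example, with overwrite fixed to false. */
    public static String build(String id, String type, int commitWithinMs) {
        return "{\"add\":{"
                + "\"commitWithin\":" + commitWithinMs + ","
                + "\"overwrite\":false,"
                + "\"doc\":{\"id\":" + quote(id) + ",\"type\":" + quote(type) + "}"
                + "}}";
    }

    public static void main(String[] args) {
        // Prints: {"add":{"commitWithin":60000,"overwrite":false,"doc":{"id":"1","type":"foo"}}}
        System.out.println(build("1", "foo", 60000));
    }
}
```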

From this article:
http://www.cominvent.com/2011/09/09/discover-commitwithin-in-solr/

I see there are multiple ways to specify the commitWithin option.

https://issues.apache.org/jira/browse/SOLR-2742 introduced it to the
.add() methods for SolrServer. Could it be broken only there?

I will go try this syntax:

    UpdateRequest req = new UpdateRequest();
    req.add(mySolrInputDocument);
    req.setCommitWithin(10000);
    req.process(server);

Cheers,

/jonathan


Re: Index not loading

Posted by Jack Krupansky <ja...@basetechnology.com>.
If you send a dummy document using a curl command, without the commit 
option, does it auto-commit and become visible in 1 minute?

-- Jack Krupansky

-----Original Message----- 
From: Jonatan Fournier
Sent: Tuesday, August 14, 2012 11:03 AM
To: solr-user@lucene.apache.org ; erickerickson@gmail.com
Subject: Re: Index not loading

Hi Erick,

On Tue, Aug 14, 2012 at 10:25 AM, Erick Erickson
<er...@gmail.com> wrote:
> This is quite odd, it really sounds like you're not
> actually committing. So, some questions.
>
> 1> What happens if you search before you shut
> down your tomcat? Do you see docs then? If so,
> somehow you're doing soft commits and never
> doing a hard commit.

No, I'm not seeing any documents when I search for anything. As
mentioned above, Num Docs and Max Doc are 0.

As I mention below, my index files are not deleted when I
start/restart Tomcat, only when I send a commit/optimize
command from within Tomcat.

One thing I noticed that was different in the log output from the
embedded server: when I use the solrconfig.xml autoCommit,
after the delay I see some stdout messages about committing to the
index. But when relying on commitWithin, I never see the Solr
server output freeze for a moment while committing; I only see all my
add-document stdout messages. Should the behavior be the same? Or do the
commit messages pass by so fast that I don't see them?

It must be trying to do some kind of commit/merge, because when I was
monitoring the memory I could see a periodic memory increase (which I
assumed was merging), then memory decreased until the next delay...

>
> 2> What happens if, as the last statement in your SolrJ
> program you do a commit()?

Let me try that and come back to you. For now, here are the commands I
was using in the 3 test scenarios:

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", someId);
...
// Scenarios 1 and 2: either <autoCommit> with <maxTime>60000</maxTime> or
// <autoSoftCommit> is enabled in solrconfig.xml. Both work: when I shut down
// my embedded server and restart Tomcat, all my data is indexed/committed.
server.add(doc);

or

// Scenario 3: autoCommit is not enabled; rely on the commitWithin parameter.
server.add(doc, 60000);


>
> 3> While you're indexing, what do you see in your index
> directory? You should see multiple segments being
> created, and possibly merged so the number of
> files should go up and down. If you only have a single
> set of files, you're somehow not doing a commit.

No, I do see a bunch of files being created/merged; at the end I had
about 89G in many, many files.

Another thing I changed while trying out commitWithin
was to set <useCompoundFile>true</useCompoundFile> and
<mergeFactor>10</mergeFactor> to reduce the number of files created.
Could that affect things?

>
> 4> Is there something really silly going on like your
> restart scripts delete the index directory? Or you're
> using a VM that restores a blank image?

No VM, no scripts, no replication.

>
> 5> When you do restart, are there any files at all
> in your index directory?

When I restart Tomcat I do see all the same 89G of files that were created
using the embedded server. They only vanish when I force a commit or
optimize; then it's as if my data directory didn't exist: the 2
initial segment files are created and all the rest are deleted.
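
One way to check this from outside Solr is to look at the segments_N commit point files in the index directory. Lucene writes N as the commit generation in base 36, so a generation that stays at 1 while data files keep piling up would at least be consistent with no hard commit having been written. The following is a minimal sketch of that check; the class name SegmentsCheck, the method name latestGeneration and the default path data/index are assumptions for illustration, not anything Solr provides.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class SegmentsCheck {

    /**
     * Returns the highest generation among the segments_N files in indexDir,
     * or -1 when no such file exists. N is parsed as a base-36 number.
     */
    public static long latestGeneration(Path indexDir) throws IOException {
        long latest = -1;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir, "segments_*")) {
            for (Path file : files) {
                String suffix = file.getFileName().toString().substring("segments_".length());
                try {
                    latest = Math.max(latest, Long.parseLong(suffix, Character.MAX_RADIX));
                } catch (NumberFormatException e) {
                    // Suffix is not a base-36 number, so this is not a commit point file.
                }
            }
        }
        return latest;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default; pass the real index directory as the first argument.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "data/index");
        System.out.println("latest commit generation: " + latestGeneration(indexDir));
    }
}
```

Running it periodically during a load and again after a restart would show whether the generation ever advances.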

>
> I really suspect you've got some configuration problem
> here....

Maybe, but other than playing with the compound file thingy I don't
have any fancy config changes.

Cheers,

/jonathan

>
> Best
> Erick
>
>
>
> On Mon, Aug 13, 2012 at 9:11 AM, Jonatan Fournier
> <jo...@gmail.com> wrote:
>> Hi,
>>
>> I'm using Solr 4.0.0-ALPHA and the EmbeddedSolrServer.
>>
>> Within my SolrJ application, the documents are added to the server
>> using the commitWithin parameter (in my case 60s). After 1 day my 125
>> millions document are all added to the server and I can see 89G of
>> index data files. I stop my SolrJ application and reload my Solr
>> instance in Tomcat.
>>
>> From the Solr admin panel related to my Core (collection1) I see this info:
>>
>>
>> Last Modified:
>> Num Docs:0
>> Max Doc:0
>> Version:1
>> Segment Count:0
>> Optimized: (green check)
>> Current:  (green check)
>> Master:
>> Version: 0
>> Gen: 1
>> Size: 88.14 GB
>>
>>
>> From the general Core Admin panel I see:
>>
>> lastModified:
>> version:1
>> numDocs:0
>> maxDoc:0
>> optimized: (red circle)
>> current: (green check)
>> hasDeletions: (red circle)
>>
>> If I query my index for *:* I get 0 results. If I trigger optimize, it
>> wipes ALL the data inside the index and resets it to empty. I played
>> around with my EmbeddedServer initially using autoCommit/softCommit and
>> it was working fine. Now that I've switched to commitWithin on the
>> document add request, it always does that! I'm never able to reload my
>> index within Tomcat/Solr.
>>
>> Any idea?
>>
>> Cheers,
>>
>> /jonathan 


Re: Index not loading

Posted by Jonatan Fournier <jo...@gmail.com>.
Hi Erick,

On Tue, Aug 14, 2012 at 10:25 AM, Erick Erickson
<er...@gmail.com> wrote:
> This is quite odd, it really sounds like you're not
> actually committing. So, some questions.
>
> 1> What happens if you search before you shut
> down your tomcat? Do you see docs then? If so,
> somehow you're doing soft commits and never
> doing a hard commit.

No, I'm not seeing any documents when I search for anything. As
mentioned, Num Docs and Max Doc are 0.

As I mention below, my index files are not deleted when I
start/restart tomcat, only when I send a commit/optimize command from
within tomcat.

One thing I noticed that was different in the log output from the
embedded server: when I use the solrconfig.xml autoCommit, after the
delay I see some stdout messages about committing to the index. But
when relying on commitWithin, I never see the solr server output freeze
for a moment while committing; I only see my add-document stdout
messages. Should the behavior be the same? Or do the commit messages
pass by so fast that I don't see them?

It must be trying to do some kind of commit/merge, because when I was
monitoring the memory I could see periodic memory increases (which I
assumed were merges), then memory decreasing until the next delay...

>
> 2> What happens if, as the last statement in your SolrJ
> program you do a commit()?

Let me try that and get back to you. For now, here are the commands I
was using in the 3 test scenarios:

SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", someId);
...
server.add(doc); // In the case where I have either autoCommit with
// <maxTime>60000</maxTime> or <autoSoftCommit> enabled in solrconfig.xml.
// Both scenarios work: in those 2 cases, when I shut down my
// embedded server and restart tomcat, I have all my data indexed/committed.

or

server.add(doc, 60000); // In the case where I don't have autoCommit
// enabled, try to rely on the commitWithin param (60000 ms).
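
For reference, the autoCommit scenario that works for me corresponds
roughly to the solrconfig.xml fragment below. This is only a sketch:
the 60000 ms value is the one from my setup, and the surrounding
<updateHandler> element is assumed to match the stock example config.

```xml
<!-- Sketch only: assumes the stock example solrconfig.xml layout -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flush pending documents to the index on disk
       at most 60000 ms after they were added -->
  <autoCommit>
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>
```

With this block enabled, the plain server.add(doc) call is the one
used; no commitWithin value is passed from SolrJ.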


>
> 3> While you're indexing, what do you see in your index
> directory? You should see multiple segments being
> created, and possibly merged so the number of
> files should go up and down. If you only have a single
> set of files, you're somehow not doing a commit.

No, I do see a bunch of files being created/merged; at the end I had
about 89G in many, many files.

Another thing I was playing around with while trying to use commitWithin
was changing <useCompoundFile>true</useCompoundFile> and
<mergeFactor>10</mergeFactor> to reduce the number of files created.
Could that affect things?
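
To be concrete, the two settings look roughly like this in my
solrconfig.xml. This is a sketch that assumes they sit under
<indexConfig> as in the Solr 4 example config; only the two values
shown are the ones from my setup.

```xml
<!-- Sketch only: placement under <indexConfig> is assumed from the
     Solr 4 example solrconfig.xml -->
<indexConfig>
  <!-- Pack each segment into a compound file to reduce the file count -->
  <useCompoundFile>true</useCompoundFile>
  <!-- Number of same-sized segments allowed before they are merged -->
  <mergeFactor>10</mergeFactor>
</indexConfig>
```

Both settings only change how segments are laid out on disk and when
they are merged, so I would not expect them to decide whether a commit
happens, but I have not verified that.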

>
> 4> Is there something really silly going on like your
> restart scripts delete the index directory? Or you're
> using a VM that restores a blank image?

No VM, no scripts, no replication.

>
> 5> When you do restart, are there any files at all
> in your index directory?

When I restart tomcat I do see all the same 89G of files that were
created using the embedded server. They only vanish when I force a
commit or optimize; then it's as if my data directory didn't exist: the
2 initial segment files are created and all the rest are deleted.

>
> I really suspect you've got some configuration problem
> here....

Maybe, but other than playing with the compound file thingy I don't
have any fancy config changes.

Cheers,

/jonathan

>
> Best
> Erick
>
>
>
> On Mon, Aug 13, 2012 at 9:11 AM, Jonatan Fournier
> <jo...@gmail.com> wrote:
>> Hi,
>>
>> I'm using Solr 4.0.0-ALPHA and the EmbeddedSolrServer.
>>
>> Within my SolrJ application, the documents are added to the server
>> using the commitWithin parameter (in my case 60s). After 1 day my 125
>> million documents are all added to the server and I can see 89G of
>> index data files. I stop my SolrJ application and reload my Solr
>> instance in Tomcat.
>>
>> From the Solr admin panel related to my Core (collection1) I see this info:
>>
>>
>> Last Modified:
>> Num Docs:0
>> Max Doc:0
>> Version:1
>> Segment Count:0
>> Optimized: (green check)
>> Current:  (green check)
>> Master:
>> Version: 0
>> Gen: 1
>> Size: 88.14 GB
>>
>>
>> From the general Core Admin panel I see:
>>
>> lastModified:
>> version:1
>> numDocs:0
>> maxDoc:0
>> optimized: (red circle)
>> current: (green check)
>> hasDeletions: (red circle)
>>
>> If I query my index for *:* I get 0 results. If I trigger optimize, it
>> wipes ALL the data inside the index and resets it to empty. I played
>> around with my EmbeddedServer initially using autoCommit/softCommit and
>> it was working fine. Now that I've switched to commitWithin on the
>> document add request, it always does that! I'm never able to reload my
>> index within Tomcat/Solr.
>>
>> Any idea?
>>
>> Cheers,
>>
>> /jonathan

Re: Index not loading

Posted by Erick Erickson <er...@gmail.com>.
This is quite odd, it really sounds like you're not
actually committing. So, some questions.

1> What happens if you search before you shut
down your tomcat? Do you see docs then? If so,
somehow you're doing soft commits and never
doing a hard commit.

2> What happens if, as the last statement in your SolrJ
program you do a commit()?

3> While you're indexing, what do you see in your index
directory? You should see multiple segments being
created, and possibly merged so the number of
files should go up and down. If you only have a single
set of files, you're somehow not doing a commit.

4> Is there something really silly going on like your
restart scripts delete the index directory? Or you're
using a VM that restores a blank image?

5> When you do restart, are there any files at all
in your index directory?

I really suspect you've got some configuration problem
here....

Best
Erick



On Mon, Aug 13, 2012 at 9:11 AM, Jonatan Fournier
<jo...@gmail.com> wrote:
> Hi,
>
> I'm using Solr 4.0.0-ALPHA and the EmbeddedSolrServer.
>
> Within my SolrJ application, the documents are added to the server
> using the commitWithin parameter (in my case 60s). After 1 day my 125
> million documents are all added to the server and I can see 89G of
> index data files. I stop my SolrJ application and reload my Solr
> instance in Tomcat.
>
> From the Solr admin panel related to my Core (collection1) I see this info:
>
>
> Last Modified:
> Num Docs:0
> Max Doc:0
> Version:1
> Segment Count:0
> Optimized: (green check)
> Current:  (green check)
> Master:
> Version: 0
> Gen: 1
> Size: 88.14 GB
>
>
> From the general Core Admin panel I see:
>
> lastModified:
> version:1
> numDocs:0
> maxDoc:0
> optimized: (red circle)
> current: (green check)
> hasDeletions: (red circle)
>
> If I query my index for *:* I get 0 results. If I trigger optimize, it
> wipes ALL the data inside the index and resets it to empty. I played
> around with my EmbeddedServer initially using autoCommit/softCommit and
> it was working fine. Now that I've switched to commitWithin on the
> document add request, it always does that! I'm never able to reload my
> index within Tomcat/Solr.
>
> Any idea?
>
> Cheers,
>
> /jonathan