Posted to solr-user@lucene.apache.org by Clemens Wyss DEV <cl...@mysign.ch> on 2016/01/04 10:34:00 UTC

Hard commits, soft commits and transaction logs

[Happy New Year to all]

Is everything mentioned/recommended in
https://lucidworks.com/blog/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
still valid for Solr 5.x?

- Clemens

Re: Hard commits, soft commits and transaction logs

Posted by Clemens Wyss DEV <cl...@mysign.ch>.
Thanks Erick.
I guess I'll go with option 3>, i.e. optimize the index "whenever appropriate".
Alternatively, could I issue a '/suggest?spellcheck.build=true' request "whenever appropriate"?

> bq: Suggestions are re-built on commit
Agreed. That was for unit-testing purposes only.
In production we have <str name="buildOnOptimize">true</str>
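
For illustration, the rebuild request from the question could be issued from plain Java roughly as below. This is only a sketch: the class SuggestRebuild, the base URL and the core name are hypothetical placeholders, and only the '/suggest' path and the 'spellcheck.build=true' parameter are taken from the question above. It does not settle whether a rebuild alone removes suggestions that come from deleted documents.

```java
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper, not part of SolrJ: triggers a suggester rebuild over HTTP.
final class SuggestRebuild {

    // Builds the rebuild URL. baseUrl and core are placeholders supplied by the
    // caller; the path and parameter are the ones quoted in the question.
    static URI buildUri(String baseUrl, String core) {
        return URI.create(baseUrl + "/" + core + "/suggest?spellcheck.build=true");
    }

    // Sends a GET request to the rebuild URL and returns the HTTP status code.
    static int trigger(String baseUrl, String core) throws Exception {
        HttpURLConnection connection =
                (HttpURLConnection) buildUri(baseUrl, core).toURL().openConnection();
        try {
            connection.setRequestMethod("GET");
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }
}
```

A scheduled job could call SuggestRebuild.trigger(...) with the real base URL and core name at whatever interval counts as "appropriate".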


Re: Hard commits, soft commits and transaction logs

Posted by Erick Erickson <er...@gmail.com>.
bq: and suggestions of deleted docs are...

OK, this is something different than I read the first time. I'm
assuming that when you mention suggestions, you're using
one of the suggesters that works off the indexed terms, which
will include data from deleted docs. In that scenario I don't know
of a good mechanism other than getting all the data associated
with deleted documents out of the index. What
people have done:

1> Just lived with it. On a reasonably large corpus, the number of suggestions
that aren't actually in a live document is often very small, small enough to
ignore. In this case you might be seeing something because of your tests that
makes no practical difference.

I'll add parenthetically that users will get empty results even if all the terms
suggested are in "live" docs assuming they, say, add filter queries. Imagine
a filter query restricting the returns to docs dated yesterday and suggestions
come back from docs dated 5 days ago.

2> Curate the suggestions. In this scenario there's a fixed list of terms in a
text file that you suggest from.

3> Optimize the index. This is usually only really acceptable for setups where
the index changes infrequently (e.g. nightly or something), which doesn't
sound like it fits your scenario at all.
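
As a rough illustration of option 2> above, suggesting from a curated list can be as simple as a sorted prefix lookup. This is a minimal sketch, not Solr API: the class CuratedSuggester and its methods are hypothetical, and it assumes the curated text file has already been read into a list of strings (one term per line).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

// Hypothetical helper, not part of Solr: suggests from a fixed, curated term list.
final class CuratedSuggester {
    // Sorted set, so all terms sharing a prefix sit next to each other.
    private final TreeSet<String> terms = new TreeSet<>();

    // curatedTerms would typically be the lines of the curated text file.
    CuratedSuggester(List<String> curatedTerms) {
        for (String term : curatedTerms) {
            String normalized = term.trim().toLowerCase(Locale.ROOT);
            if (!normalized.isEmpty()) {
                terms.add(normalized);
            }
        }
    }

    // Returns up to `limit` curated terms that start with `prefix`, in sorted order.
    List<String> suggest(String prefix, int limit) {
        String p = prefix.trim().toLowerCase(Locale.ROOT);
        List<String> result = new ArrayList<>();
        // Start at the first term >= prefix and stop at the first non-match.
        for (String candidate : terms.tailSet(p, true)) {
            if (!candidate.startsWith(p) || result.size() >= limit) {
                break;
            }
            result.add(candidate);
        }
        return result;
    }
}
```

Because the list is maintained by hand, deleted documents never influence it; the cost is that someone has to keep the file in sync with what is actually searchable.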

bq: Suggestions are re-built on commit

I'm going to go out on a limb and say that this is unlikely to work at all
for production in an NRT setup. Rebuilding will take far too much time on a
significantly-sized corpus to be feasible. At least that's my fear; I'm mostly
advising you to check this before even trying to scale up.

Best,
Erick


Re: Hard commits, soft commits and transaction logs

Posted by Clemens Wyss DEV <cl...@mysign.ch>.
Sorry for coming back to this topic:
You (Erick) mention "By and large, do not issue commits from any client indexing to Solr"

In order to achieve NRT, I am testing, for example:
     <autoCommit>
       <maxTime>180000</maxTime> <!-- 3min -->
       <openSearcher>true</openSearcher>
     </autoCommit>
     <autoSoftCommit>
       <maxTime>10000</maxTime> <!-- 10sec -->
     </autoSoftCommit>

For (unit) testing purposes:
     <autoCommit>
       <maxTime>1000</maxTime> <!-- 1sec -->
       <openSearcher>true</openSearcher>
     </autoCommit>
     <autoSoftCommit>
       <maxTime>500</maxTime> <!-- 500ms -->
     </autoSoftCommit>

Suggestions are re-built on commit
...
<str name="buildOnCommit">true</str>
...

(Almost) all my unit tests pass, except for my docDeletion test: it looks like expungeDeletes is never "issued", and suggestions from deleted docs are returned.
When I explicitly issue an "expunging soft commit":

// Arguments after the action: waitFlush, waitSearcher, maxSegments, softCommit, expungeDeletes
UpdateRequest rq = new UpdateRequest();
rq.setAction( UpdateRequest.ACTION.COMMIT, false, false, 100, true, true );
rq.process( solrClient );

the test passes and no false suggestions are returned. What am I facing?


Re: Hard commits, soft commits and transaction logs

Posted by Erick Erickson <er...@gmail.com>.
As far as I know. If you see anything different, let me know and
we'll see if we can update it.

Best,
Erick
