Posted to solr-user@lucene.apache.org by Zheng Lin Edwin Yeo <ed...@gmail.com> on 2015/03/19 10:54:46 UTC

Documents cannot be searched immediately when indexed using REST API with Solr Cloud

Hi,

I'm using Solr Cloud with 2 shards, shard1 and shard2. When I index
rich-text documents using the REST API or the default Documents module in
the Solr Admin UI, the indexed documents do not appear immediately when I
search. They only appear after I restart the Solr services (both shard1
and shard2).

However, the same issue does not happen when I index the same documents
using post.jar; in that case I can search for the indexed documents
immediately.

Here's my ExtractingRequestHandler in solrconfig.xml.

  <requestHandler name="/update/extract"
                  class="solr.extraction.ExtractingRequestHandler" >
    <lst name="defaults">
      <str name="lowernames">true</str>
      <str name="uprefix">ignored_</str>

      <!-- capture link hrefs but ignore div attributes -->
      <str name="captureAttr">true</str>
      <str name="fmap.a">links</str>
      <str name="fmap.div">ignored_</str>
    </lst>
  </requestHandler>

What could be causing this, and how can I solve it?

Regards,
Edwin

Re: Documents cannot be searched immediately when indexed using REST API with Solr Cloud

Posted by Zheng Lin Edwin Yeo <ed...@gmail.com>.
Thank you for the information.

Yes, the program is working correctly now and I can search for the
documents immediately after issuing commit=true.

Regards,
Edwin


On 20 March 2015 at 04:07, Erick Erickson <er...@gmail.com> wrote:

> The post jar issues a hard commit (openSearcher=true) as part of the
> operation. As Liu says, you are probably not committing the changes
> after ingestion.
>
> You can issue this from a browser:
> .....solr/collection/update?commit=true
> to force a commit manually.
>
> Best,
> Erick

Re: Documents cannot be searched immediately when indexed using REST API with Solr Cloud

Posted by Erick Erickson <er...@gmail.com>.
The post jar issues a hard commit (openSearcher=true) as part of the
operation. As Liu says, you are probably not committing the changes
after ingestion.

You can issue this from a browser:
.....solr/collection/update?commit=true
to force a commit manually.
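
A minimal sketch of the same explicit commit written as an XML update
message rather than a URL parameter. It assumes the message is sent as the
body of an HTTP POST to the collection's update handler with a Content-Type
of text/xml; the exact host and collection path are not given in this thread.

```xml
<!-- Hard commit: flushes pending documents and, by default, opens a new
     searcher so that they become visible to queries. -->
<commit/>
```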

Best,
Erick

On Thu, Mar 19, 2015 at 3:54 AM, Liu Bo <di...@gmail.com> wrote:
> Hi Edwin
>
> Please review your commit/soft-commit configuration,
> "soft commits are about visibility, hard commits are about durability"
>  ---- by a wise man. :)
>
> If you are doing NRT indexing and searching, you probably need a short soft
> commit interval or commit explicitly in your request handler. Be advised
> that these strategies and configurations need to be tested and adjusted
> according to your data size, searching and index updating frequency.
>
> You should be able to find the answer yourself here:
> http://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> All the best
>
> Liu Bo

Re: Documents cannot be searched immediately when indexed using REST API with Solr Cloud

Posted by Liu Bo <di...@gmail.com>.
Hi Edwin

Please review your commit/soft-commit configuration,
"soft commits are about visibility, hard commits are about durability"
 ---- by a wise man. :)

If you are doing NRT indexing and searching, you probably need a short soft
commit interval or commit explicitly in your request handler. Be advised
that these strategies and configurations need to be tested and adjusted
according to your data size, searching and index updating frequency.
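
As a rough sketch of what a short soft commit interval could look like in
solrconfig.xml, assuming the stock DirectUpdateHandler2 update handler. The
interval values below are illustrative assumptions, not recommendations from
this thread, and should be tuned as described above.

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Transaction log; SolrCloud relies on it for recovery -->
  <updateLog>
    <str name="dir">${solr.ulog.dir:}</str>
  </updateLog>

  <!-- Hard commit for durability: flush to disk every 15 s (assumed value)
       without opening a new searcher -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit for visibility: make new documents searchable within
       about 1 s (assumed value) -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
```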

You should be able to find the answer yourself here:
http://lucidworks.com/blog/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

All the best

Liu Bo
