Posted to solr-user@lucene.apache.org by Nagendra Nagarajayya <nn...@transaxtions.com> on 2011/07/18 16:43:11 UTC
[Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high
performance 10000 tps
Hi!
I would like to announce that Solr 3.3 with RankingAlgorithm and Near
Real Time (NRT) search capability is now available. The NRT performance
is very high: 10,000 documents/sec with the MBArtists 390k index. The NRT
functionality allows you to add documents without the IndexSearchers
being closed or the caches being cleared. A commit is also not needed
with a document update, and searches can run concurrently with document
updates. No changes are needed except for enabling NRT through
solrconfig.xml.
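For readers who want to try the update-without-commit flow, here is a minimal Python sketch. It only uses the standard Solr XML update message and the /update and /select handlers; the base URL, the field names, and the helper names are assumptions made for illustration and are not taken from the Solr-RA distribution.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen
from xml.sax.saxutils import escape, quoteattr

# Assumption: a local Solr instance at the usual example URL.
SOLR_BASE = "http://localhost:8983/solr"


def build_add_xml(doc):
    """Build a Solr XML <add> message for one document (dict of field -> value)."""
    fields = "".join(
        "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
        for name, value in doc.items()
    )
    return "<add><doc>%s</doc></add>" % fields


def build_update_request(doc, base=SOLR_BASE):
    """Build a POST request for the update handler. No commit is requested."""
    return Request(
        base + "/update",
        data=build_add_xml(doc).encode("utf-8"),
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )


def build_select_url(query, base=SOLR_BASE):
    """Build a query URL for the standard select handler."""
    return base + "/select?" + urlencode({"q": query})


def demo():
    """Send one update and query right away (needs a running Solr instance)."""
    # Hypothetical document; the field names depend on your schema.
    urlopen(build_update_request({"id": "artist-1", "name": "Example"})).read()
    print(urlopen(build_select_url("name:Example")).read().decode("utf-8"))
```

Calling demo() against an instance with NRT enabled should show the new document without a separate commit request; against a stock Solr it would normally stay invisible until a commit.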
RankingAlgorithm query performance is now 3x faster than before
and is exposed as the Lucene API. This release also adds support for
making the last document with a given unique id searchable and visible
in search results when a document is updated multiple times.
I have a wiki page that describes NRT performance in detail and can be
accessed from here:
http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x
You can download Solr 3.3 with RankingAlgorithm (NRT version) from here:
http://solr-ra.tgels.org
I would like to invite you to give this version a try as the performance
is very high.
Regards,
- Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org
Re: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very
high performance 10000 tps
Posted by Nagendra Nagarajayya <nn...@transaxtions.com>.
Thanks Mark! I made the earlier implementation of NRT with 1.4.1
available to Solr through a JIRA issue:
https://issues.apache.org/jira/browse/SOLR-2568
(I had made the implementation details available through a paper
published at
http://solr-ra.tgels.com/papers/NRT_Solr_RankingAlgorithm.pdf, which
includes the source, modifications, etc.)
I plan to make available the current implementation of NRT with Solr
3.2/3.3 and RankingAlgorithm as a patch. This implementation has very
high performance (10000 docs/sec) and in fact on my system is faster
than the normal update/commit.
Some issues are not yet resolved, such as when to invalidate or update
the cache; this does not seem to be an easy problem.
Regarding the Lucene list, I thought both Solr and Lucene were now
shared projects. I can add a message to my emails to make it clear that
Solr with RankingAlgorithm is an external implementation. I also plan to
file an RFE to allow plugin/API support for external text search
libraries in Solr.
- Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org
On 7/18/2011 9:45 AM, Mark Miller wrote:
> Hey Nagendra - I don't mind seeing these external project announcements here (though you might keep Solr-related announcements off the Lucene user list), but please word them so that users are not confused into thinking this is an Apache release, and so that it is clear this is an external project built on top of Apache Solr.
>
> Thanks,
>
> - Mark
Re: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps
Posted by Mark Miller <ma...@gmail.com>.
Hey Nagendra - I don't mind seeing these external project announcements here (though you might keep Solr-related announcements off the Lucene user list), but please word them so that users are not confused into thinking this is an Apache release, and so that it is clear this is an external project built on top of Apache Solr.
Thanks,
- Mark
- Mark Miller
lucidimagination.com
Re: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very
high performance 10000 tps
Posted by Nagendra Nagarajayya <nn...@transaxtions.com>.
Yes, this problem has been solved, though not completely; there is still
a refresh problem. To eliminate duplicate documents with the same unique
id during an update, you need to set
<maxBufferedDeleteTerms>1</maxBufferedDeleteTerms>. This makes the most
recently updated document searchable and removes the older documents.
There is a catch, though: if some fields differ between the old and new
versions of a document, older content might show up in the results even
though the query matches the most recent content. For example, if the
old document was <doc><afield>xyz</afield></doc> and it is updated to
<doc><afield>abc</afield></doc>, then at query time q=afield:abc matches,
but the results may show <doc><afield>xyz</afield></doc>. I am still
researching this.
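To make the intended behavior concrete, here is a toy Python model of unique-id overwrite semantics. It is only an illustration of what a correct update should look like to a searcher, not the Solr-RA or Lucene implementation; the class and method names are hypothetical.

```python
class TinyIndex:
    """Toy in-memory index with unique-key overwrite semantics (illustrative only)."""

    def __init__(self, unique_key="id"):
        self.unique_key = unique_key
        self.docs = []  # stored documents, in insertion order

    def add(self, doc):
        # Delete any older version with the same unique key, then add the new one.
        key = doc[self.unique_key]
        self.docs = [d for d in self.docs if d[self.unique_key] != key]
        self.docs.append(dict(doc))

    def search(self, field, value):
        # Exact-match lookup on a single field.
        return [d for d in self.docs if d.get(field) == value]


index = TinyIndex()
index.add({"id": "1", "afield": "xyz"})  # old version
index.add({"id": "1", "afield": "abc"})  # update with the same id

print(index.search("afield", "abc"))  # [{'id': '1', 'afield': 'abc'}]
print(index.search("afield", "xyz"))  # []
```

In this model the query and the returned content always come from the same, newest version. The catch described above is the case where the match comes from the new version but the displayed fields still come from the old one.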
You can get more information about the performance and known issues here:
http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver_3.x
Regards,
- Nagendra Nagarajayya
http://solr-ra.tgels.org
http://rankingalgorithm.tgels.org
On 7/19/2011 1:21 AM, Andy wrote:
> Nagendra,
>
> In another email you mentioned there's a problem where if an existing document is updated both the old and new version will show up in search results.
>
> Has that been solved in Solr-RA 3.3?
Re: [Announce] Solr 3.3 with RankingAlgorithm NRT capability, very high performance 10000 tps
Posted by Andy <an...@yahoo.com>.
Nagendra,
In another email you mentioned there's a problem where if an existing document is updated both the old and new version will show up in search results.
Has that been solved in Solr-RA 3.3?