Posted to general@lucene.apache.org by Nagendra Nagarajayya <nn...@transaxtions.com> on 2013/01/27 15:25:04 UTC
[Announce] Apache Solr 3.6.2 with RankingAlgorithm 1.4.3 available
for download now -- includes experimental TimedSerialMergeScheduler
Hi:
I am very excited to announce the availability of Apache Solr 3.6.2 with
RankingAlgorithm30 1.4.3 with realtime-search support. realtime-search
is a very fast NRT implementation that lets you not only look up a document
by id but also search in realtime; see
http://tgels.org/realtime-nrt.jsp. The update performance is about
10,000 docs / sec. Query times are in milliseconds: you can query
a 10m Wikipedia index (complete index) in <50 ms.
This release also includes an experimental TimedSerialMergeScheduler
<http://rankingalgorithm.1050964.n5.nabble.com/TimedSerialMergerScheduler-java-allows-merges-to-be-deferred-to-a-known-time-like-11pm-or-1am-tp5706350.html> that
allows you to postpone your merges to an off-hours time like 11pm or 1am,
increasing performance.
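To illustrate the idea of deferring merges to an off-hours window, here is a
minimal sketch in Java. It is not the TimedSerialMergeScheduler source: the
MergeWindow class, its allows method, and the 23:00 to 01:00 window are all
assumptions made for this example. It only shows the time-window check that
such a scheduler would need, including a window that crosses midnight.

```java
import java.time.LocalTime;

/** Hypothetical helper: decides whether a deferred merge may run at a given time. */
public class MergeWindow {
    private final LocalTime start;
    private final LocalTime end;

    public MergeWindow(LocalTime start, LocalTime end) {
        this.start = start;
        this.end = end;
    }

    /** True when now falls inside [start, end); handles windows that cross midnight. */
    public boolean allows(LocalTime now) {
        if (start.isBefore(end)) {
            // Ordinary window within a single day, e.g. 01:00 to 03:00.
            return !now.isBefore(start) && now.isBefore(end);
        }
        // Window wraps past midnight, e.g. 23:00 to 01:00.
        return !now.isBefore(start) || now.isBefore(end);
    }

    public static void main(String[] args) {
        MergeWindow window = new MergeWindow(LocalTime.of(23, 0), LocalTime.of(1, 0));
        System.out.println(window.allows(LocalTime.of(23, 30))); // inside the window
        System.out.println(window.allows(LocalTime.of(14, 0)));  // outside the window
    }
}
```

A scheduler built on this check would queue merge requests while allows
returns false and run them serially once the window opens.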
RankingAlgorithm30 1.4.3 supports the entire Lucene Query Syntax, ±
and/or boolean queries.
You can get more information about realtime-search performance from here:
http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x
You can download Solr 3.6.2 with RankingAlgorithm30 1.4.3 from here:
http://solr-ra.tgels.org
Please download and give the new version a try.
Note:
1. Apache Solr 3.6.2 with RankingAlgorithm30 1.4.3 is an external project.
2. realtime-search has been contributed back to Apache Solr, see
https://issues.apache.org/jira/browse/SOLR-3816
Regards,
Nagendra Nagarajayya
http://solr-ra.tgels.org
http://elasticsearch-ra.tgels.org
http://rankingalgorithm.tgels.org
Re: [Announce] Apache Solr 3.6.2 with RankingAlgorithm 1.4.3 available
for download now -- includes experimental TimedSerialMergeScheduler
Posted by Nagendra Nagarajayya <nn...@transaxtions.com>.
Hi David:
Please see the new patch that I made available for JIRA:
https://issues.apache.org/jira/browse/SOLR-3816
This removes the code that Yonik had highlighted and also introduces
granularity-based realtime-search, with two levels: request granularity
and intra-request granularity. Request granularity means that each request
may return new results. The underlying code ensures that all the
components of a request (search, highlighting, faceting, etc.) see the
same view of the index. Intra-request granularity means that each
component may see the changes happening to the index, so each may return
different results.
Request granularity has higher performance compared to intra-request
granularity. The SolrIndexSearcher object is not closed as before. The commit
or autocommit time can be set to a very high value, and the transaction log
disabled (for use without SolrCloud), for further improvement in
performance.
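The difference between the two granularities can be sketched with a toy model
in Java. This is not code from the SOLR-3816 patch: the GranularityDemo class
and its method names are hypothetical, and an immutable list stands in for a
view of the index. With request granularity the view is captured once and
shared by every component; with intra-request granularity each component
fetches the latest view itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Toy model (not Solr code): an "index" whose current view is an immutable snapshot. */
public class GranularityDemo {
    private final AtomicReference<List<String>> current =
            new AtomicReference<>(Collections.<String>emptyList());

    /** Publishes a new immutable view containing one more document. */
    public void add(String doc) {
        current.updateAndGet(old -> {
            List<String> next = new ArrayList<>(old);
            next.add(doc);
            return Collections.unmodifiableList(next);
        });
    }

    /** Request granularity: capture one view up front and hand it to every component. */
    public List<String> openRequestView() {
        return current.get();
    }

    /** Intra-request granularity: each component asks for the latest view itself. */
    public List<String> latestView() {
        return current.get();
    }

    public static void main(String[] args) {
        GranularityDemo index = new GranularityDemo();
        index.add("doc1");

        List<String> requestView = index.openRequestView(); // captured once per request
        int searchCount = requestView.size();               // "search" component
        index.add("doc2");                                  // an update arrives mid-request
        int facetCount = requestView.size();                // "facet" component, same view
        int facetCountIntra = index.latestView().size();    // a fresh view sees the update

        System.out.println(searchCount + " " + facetCount + " " + facetCountIntra);
    }
}
```

In this model the search and facet components agree under request granularity,
while a component that re-reads the latest view can disagree with the one that
ran before it.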
Regards,
Nagendra Nagarajayya
http://solr-ra.tgels.org
http://elasticsearch-ra.tgels.org
http://rankingalgorithm.tgels.org
On 1/30/2013 7:42 AM, Nagendra Nagarajayya wrote:
> Hi David:
>
> There are no NRT tricks being used. It uses the NRT-capable Reader
> made available by the IndexWriter.
> (The source is attached to the JIRA.)
>
> I would suggest that you download and give this a try. You can
> download from here:
> http://solr-ra.tgels.org (you can download the 4.0 or the 3.6.2
> version)
>
> I have a user who is using this in realtime, having indexed close to 2
> billion docs with no issues.
> If you do find any problems, please let me know or add onto the
> existing JIRA so that I can fix it.
>
> Regards,
>
> Nagendra Nagarajayya
> http://solr-ra.tgels.org
> http://elasticsearch-ra.tgels.org
> http://rankingalgorithm.tgels.org
>
>
> On 1/29/2013 8:23 AM, Smiley, David W. wrote:
>> Hi.
>>
>> Speaking for myself anyway, I am leery of using it without it having
>> extensive concurrent tests to validate that the NRT tricks you're doing
>> don't produce incorrect results. It would no doubt be very difficult to
>> develop this test. And this would be the kind of test that runs for
>> a while and you would stop it after running it overnight or for whatever
>> duration makes one feel comfortable.
>>
>> ~ David
>>
>> On 1/29/13 8:34 AM, "Nagendra Nagarajayya"
>> <nn...@transaxtions.com>
>> wrote:
>>
>>> Hi David:
>>>
>>> Did you have a chance to see my comments in the JIRA ?
>>>
>>> Regards,
>>> -NN
>>>
>>> On 1/28/2013 11:58 AM, Smiley, David W. wrote:
>>>> Nagendra,
>>>>
>>>> I'm surprised to see you're still promoting your realtime-search based
>>>> system given the critical problem that Yonik found:
>>>>
>>>> https://issues.apache.org/jira/browse/SOLR-3816?focusedCommentId=13494815&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13494815
>>>>
>>>> That is a serious fundamental flaw, I'm sorry to say. To quote Yonik:
>>>> "You'll get incorrect documents back, incorrect facets back, pretty much
>>>> any number of random looking bugs because internal docids will be
>>>> changing underneath you." This won't necessarily happen all the time
>>>> depending on the timing of the search with respect to concurrent
>>>> changes, but it can happen.
>>>>
>>>> ~ David Smiley
>>>>
>>>> On 1/27/13 9:25 AM, "Nagendra Nagarajayya"
>>>> <nn...@transaxtions.com>
>>>> wrote:
>>>>
>>>>> Hi:
>>>>>
>>>>> I am very excited to announce the availability of Apache Solr 3.6.2
>>>>> with RankingAlgorithm30 1.4.3 with realtime-search support.
>>>>> realtime-search is a very fast NRT implementation that lets you not
>>>>> only look up a document by id but also search in realtime; see
>>>>> http://tgels.org/realtime-nrt.jsp. The update performance is about
>>>>> 10,000 docs / sec. Query times are in milliseconds: you can query
>>>>> a 10m Wikipedia index (complete index) in <50 ms.
>>>>>
>>>>> This release also includes an experimental TimedSerialMergeScheduler
>>>>> <http://rankingalgorithm.1050964.n5.nabble.com/TimedSerialMergerScheduler-java-allows-merges-to-be-deferred-to-a-known-time-like-11pm-or-1am-tp5706350.html> that
>>>>> allows you to postpone your merges to an off-hours time like 11pm or 1am,
>>>>> increasing performance.
>>>>>
>>>>> RankingAlgorithm30 1.4.3 supports the entire Lucene Query Syntax, ±
>>>>> and/or boolean queries.
>>>>>
>>>>> You can get more information about realtime-search performance from here:
>>>>> http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x
>>>>>
>>>>> You can download Solr 3.6.2 with RankingAlgorithm30 1.4.3 from here:
>>>>> http://solr-ra.tgels.org
>>>>>
>>>>> Please download and give the new version a try.
>>>>>
>>>>> Note:
>>>>> 1. Apache Solr 3.6.2 with RankingAlgorithm30 1.4.3 is an external project.
>>>>> 2. realtime-search has been contributed back to Apache Solr, see
>>>>> https://issues.apache.org/jira/browse/SOLR-3816
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> Nagendra Nagarajayya
>>>>> http://solr-ra.tgels.org
>>>>> http://elasticsearch-ra.tgels.org
>>>>> http://rankingalgorithm.tgels.org
>>>>>
>>>>
>>
>>
>
>
>
Re: [Announce] Apache Solr 3.6.2 with RankingAlgorithm 1.4.3 available
for download now -- includes experimental TimedSerialMergeScheduler
Posted by Nagendra Nagarajayya <nn...@transaxtions.com>.
Hi David:
There are no NRT tricks being used. It uses the NRT-capable Reader made
available by the IndexWriter.
(The source is attached to the JIRA.)
I would suggest that you download and give this a try. You can download
from here:
http://solr-ra.tgels.org (you can download the 4.0 or the 3.6.2 version)
I have a user who is using this in realtime, having indexed close to 2
billion docs with no issues.
If you do find any problems, please let me know or add onto the existing
JIRA so that I can fix it.
Regards,
Nagendra Nagarajayya
http://solr-ra.tgels.org
http://elasticsearch-ra.tgels.org
http://rankingalgorithm.tgels.org
On 1/29/2013 8:23 AM, Smiley, David W. wrote:
> Hi.
>
> Speaking for myself anyway, I am leery of using it without it having
> extensive concurrent tests to validate that the NRT tricks you're doing
> don't produce incorrect results. It would no doubt be very difficult to
> develop this test. And this would be the kind of test that runs for
> a while and you would stop it after running it overnight or for whatever
> duration makes one feel comfortable.
>
> ~ David
>
> On 1/29/13 8:34 AM, "Nagendra Nagarajayya" <nn...@transaxtions.com>
> wrote:
>
>> Hi David:
>>
>> Did you have a chance to see my comments in the JIRA ?
>>
>> Regards,
>> -NN
>>
>> On 1/28/2013 11:58 AM, Smiley, David W. wrote:
>>> Nagendra,
>>>
>>> I'm surprised to see you're still promoting your realtime-search based
>>> system given the critical problem that Yonik found:
>>>
>>> https://issues.apache.org/jira/browse/SOLR-3816?focusedCommentId=13494815&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13494815
>>>
>>> That is a serious fundamental flaw, I'm sorry to say. To quote Yonik:
>>> "You'll get incorrect documents back, incorrect facets back, pretty much
>>> any number of random looking bugs because internal docids will be
>>> changing underneath you." This won't necessarily happen all the time
>>> depending on the timing of the search with respect to concurrent
>>> changes, but it can happen.
>>>
>>> ~ David Smiley
>>>
>>> On 1/27/13 9:25 AM, "Nagendra Nagarajayya"
>>> <nn...@transaxtions.com>
>>> wrote:
>>>
>>>> Hi:
>>>>
>>>> I am very excited to announce the availability of Apache Solr 3.6.2
>>>> with RankingAlgorithm30 1.4.3 with realtime-search support.
>>>> realtime-search is a very fast NRT implementation that lets you not
>>>> only look up a document by id but also search in realtime; see
>>>> http://tgels.org/realtime-nrt.jsp. The update performance is about
>>>> 10,000 docs / sec. Query times are in milliseconds: you can query
>>>> a 10m Wikipedia index (complete index) in <50 ms.
>>>>
>>>> This release also includes an experimental TimedSerialMergeScheduler
>>>> <http://rankingalgorithm.1050964.n5.nabble.com/TimedSerialMergerScheduler-java-allows-merges-to-be-deferred-to-a-known-time-like-11pm-or-1am-tp5706350.html> that
>>>> allows you to postpone your merges to an off-hours time like 11pm or 1am,
>>>> increasing performance.
>>>>
>>>> RankingAlgorithm30 1.4.3 supports the entire Lucene Query Syntax, ±
>>>> and/or boolean queries.
>>>>
>>>> You can get more information about realtime-search performance from here:
>>>> http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x
>>>>
>>>> You can download Solr 3.6.2 with RankingAlgorithm30 1.4.3 from here:
>>>> http://solr-ra.tgels.org
>>>>
>>>> Please download and give the new version a try.
>>>>
>>>> Note:
>>>> 1. Apache Solr 3.6.2 with RankingAlgorithm30 1.4.3 is an external project.
>>>> 2. realtime-search has been contributed back to Apache Solr, see
>>>> https://issues.apache.org/jira/browse/SOLR-3816
>>>>
>>>>
>>>> Regards,
>>>>
>>>> Nagendra Nagarajayya
>>>> http://solr-ra.tgels.org
>>>> http://elasticsearch-ra.tgels.org
>>>> http://rankingalgorithm.tgels.org
>>>>
>>>
>
>
Re: [Announce] Apache Solr 3.6.2 with RankingAlgorithm 1.4.3
available for download now -- includes experimental
TimedSerialMergeScheduler
Posted by "Smiley, David W." <ds...@mitre.org>.
Hi.
Speaking for myself anyway, I am leery of using it without it having
extensive concurrent tests to validate that the NRT tricks you're doing
don't produce incorrect results. It would no doubt be very difficult to
develop this test. And this would be the kind of test that runs for
a while and you would stop it after running it overnight or for whatever
duration makes one feel comfortable.
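The shape of such a long-running concurrent test can be sketched roughly as
follows. This is only an illustration in plain Java, not a test against Solr:
the ConcurrentCheck class and its run method are hypothetical, and a
ConcurrentHashMap stands in for the index under test. One thread keeps
writing while another keeps verifying an invariant (a lookup by id returns
the document for that id) until a deadline passes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Toy harness, not Solr code: a map stands in for the index under test. */
public class ConcurrentCheck {

    /** Runs a writer and a checker side by side for millis ms; returns the mismatch count. */
    public static long run(long millis) {
        final Map<Integer, String> index = new ConcurrentHashMap<>();
        final AtomicLong mismatches = new AtomicLong();
        final long deadline = System.currentTimeMillis() + millis;

        Thread writer = new Thread(() -> {
            int id = 0;
            while (System.currentTimeMillis() < deadline) {
                index.put(id, "doc-" + id);  // each value encodes its own id
                id = (id + 1) % 10_000;      // bound memory on long runs
            }
        });
        Thread checker = new Thread(() -> {
            while (System.currentTimeMillis() < deadline) {
                for (Map.Entry<Integer, String> e : index.entrySet()) {
                    // Invariant: the document stored under an id is the one for that id.
                    if (!e.getValue().equals("doc-" + e.getKey())) {
                        mismatches.incrementAndGet();
                    }
                }
            }
        });

        writer.start();
        checker.start();
        try {
            writer.join();
            checker.join();
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return mismatches.get();
    }

    public static void main(String[] args) {
        // A real run would last hours; 200 ms keeps the example quick.
        System.out.println("mismatches: " + run(200));
    }
}
```

Against a real index the writer would add, update, and delete documents and
the checker would issue searches and compare each hit's stored id with the
query that produced it.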
~ David
On 1/29/13 8:34 AM, "Nagendra Nagarajayya" <nn...@transaxtions.com>
wrote:
>Hi David:
>
>Did you have a chance to see my comments in the JIRA ?
>
>Regards,
>-NN
>
>On 1/28/2013 11:58 AM, Smiley, David W. wrote:
>> Nagendra,
>>
>> I'm surprised to see you're still promoting your realtime-search based
>> system given the critical problem that Yonik found:
>>
>> https://issues.apache.org/jira/browse/SOLR-3816?focusedCommentId=13494815&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13494815
>>
>> That is a serious fundamental flaw, I'm sorry to say. To quote Yonik:
>> "You'll get incorrect documents back, incorrect facets back, pretty much
>> any number of random looking bugs because internal docids will be
>> changing underneath you." This won't necessarily happen all the time
>> depending on the timing of the search with respect to concurrent
>> changes, but it can happen.
>>
>> ~ David Smiley
>>
>> On 1/27/13 9:25 AM, "Nagendra Nagarajayya"
>><nn...@transaxtions.com>
>> wrote:
>>
>>> Hi:
>>>
>>> I am very excited to announce the availability of Apache Solr 3.6.2
>>> with RankingAlgorithm30 1.4.3 with realtime-search support.
>>> realtime-search is a very fast NRT implementation that lets you not
>>> only look up a document by id but also search in realtime; see
>>> http://tgels.org/realtime-nrt.jsp. The update performance is about
>>> 10,000 docs / sec. Query times are in milliseconds: you can query
>>> a 10m Wikipedia index (complete index) in <50 ms.
>>>
>>> This release also includes an experimental TimedSerialMergeScheduler
>>> <http://rankingalgorithm.1050964.n5.nabble.com/TimedSerialMergerScheduler-java-allows-merges-to-be-deferred-to-a-known-time-like-11pm-or-1am-tp5706350.html> that
>>> allows you to postpone your merges to an off-hours time like 11pm or 1am,
>>> increasing performance.
>>>
>>> RankingAlgorithm30 1.4.3 supports the entire Lucene Query Syntax, ±
>>> and/or boolean queries.
>>>
>>> You can get more information about realtime-search performance from here:
>>> http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x
>>>
>>> You can download Solr 3.6.2 with RankingAlgorithm30 1.4.3 from here:
>>> http://solr-ra.tgels.org
>>>
>>> Please download and give the new version a try.
>>>
>>> Note:
>>> 1. Apache Solr 3.6.2 with RankingAlgorithm30 1.4.3 is an external project.
>>> 2. realtime-search has been contributed back to Apache Solr, see
>>> https://issues.apache.org/jira/browse/SOLR-3816
>>>
>>>
>>> Regards,
>>>
>>> Nagendra Nagarajayya
>>> http://solr-ra.tgels.org
>>> http://elasticsearch-ra.tgels.org
>>> http://rankingalgorithm.tgels.org
>>>
>>
>>
>
Re: [Announce] Apache Solr 3.6.2 with RankingAlgorithm 1.4.3 available
for download now -- includes experimental TimedSerialMergeScheduler
Posted by Nagendra Nagarajayya <nn...@transaxtions.com>.
Hi David:
Did you have a chance to see my comments in the JIRA ?
Regards,
-NN
On 1/28/2013 11:58 AM, Smiley, David W. wrote:
> Nagendra,
>
> I'm surprised to see you're still promoting your realtime-search based
> system given the critical problem that Yonik found:
> https://issues.apache.org/jira/browse/SOLR-3816?focusedCommentId=13494815&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13494815
>
> That is a serious fundamental flaw, I'm sorry to say. To quote Yonik:
> "You'll get incorrect documents back, incorrect facets back, pretty much
> any number of random looking bugs because internal docids will be changing
> underneath you." This won't necessarily happen all the time depending on
> the timing of the search with respect to concurrent changes, but it can
> happen.
>
> ~ David Smiley
>
> On 1/27/13 9:25 AM, "Nagendra Nagarajayya" <nn...@transaxtions.com>
> wrote:
>
>> Hi:
>>
>> I am very excited to announce the availability of Apache Solr 3.6.2 with
>> RankingAlgorithm30 1.4.3 with realtime-search support. realtime-search
>> is a very fast NRT implementation that lets you not only look up a document
>> by id but also search in realtime; see
>> http://tgels.org/realtime-nrt.jsp. The update performance is about
>> 10,000 docs / sec. Query times are in milliseconds: you can query
>> a 10m Wikipedia index (complete index) in <50 ms.
>>
>> This release also includes an experimental TimedSerialMergeScheduler
>> <http://rankingalgorithm.1050964.n5.nabble.com/TimedSerialMergerScheduler-java-allows-merges-to-be-deferred-to-a-known-time-like-11pm-or-1am-tp5706350.html> that
>> allows you to postpone your merges to an off-hours time like 11pm or 1am,
>> increasing performance.
>>
>> RankingAlgorithm30 1.4.3 supports the entire Lucene Query Syntax, ±
>> and/or boolean queries.
>>
>> You can get more information about realtime-search performance from here:
>> http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x
>>
>> You can download Solr 3.6.2 with RankingAlgorithm30 1.4.3 from here:
>> http://solr-ra.tgels.org
>>
>> Please download and give the new version a try.
>>
>> Note:
>> 1. Apache Solr 3.6.2 with RankingAlgorithm30 1.4.3 is an external project.
>> 2. realtime-search has been contributed back to Apache Solr, see
>> https://issues.apache.org/jira/browse/SOLR-3816
>>
>>
>> Regards,
>>
>> Nagendra Nagarajayya
>> http://solr-ra.tgels.org
>> http://elasticsearch-ra.tgels.org
>> http://rankingalgorithm.tgels.org
>>
>
>
Re: [Announce] Apache Solr 3.6.2 with RankingAlgorithm 1.4.3
available for download now -- includes experimental
TimedSerialMergeScheduler
Posted by "Smiley, David W." <ds...@mitre.org>.
Nagendra,
I'm surprised to see you're still promoting your realtime-search based
system given the critical problem that Yonik found:
https://issues.apache.org/jira/browse/SOLR-3816?focusedCommentId=13494815&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13494815
That is a serious fundamental flaw, I'm sorry to say. To quote Yonik:
"You'll get incorrect documents back, incorrect facets back, pretty much
any number of random looking bugs because internal docids will be changing
underneath you." This won't necessarily happen all the time depending on
the timing of the search with respect to concurrent changes, but it can
happen.
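Why changing internal docids produce wrong results can be shown with a toy
model in Java. This is not Lucene code: the DocIdShiftDemo class and its
mergeWithout method are hypothetical, and a docid is modeled simply as a
position in a list. A docid obtained from one view of the index is only
meaningful against that same view; once a merge renumbers documents, the same
number points at a different document.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model (not Lucene code): internal docids as positions in a list. */
public class DocIdShiftDemo {

    /** Simulates a merge that drops a deleted document and renumbers the rest. */
    public static List<String> mergeWithout(List<String> view, String deleted) {
        List<String> merged = new ArrayList<>(view);
        merged.remove(deleted);
        return merged;
    }

    public static void main(String[] args) {
        List<String> oldView = Arrays.asList("A", "B", "C");
        int hit = oldView.indexOf("B");                    // a search matched "B" at docid 1

        List<String> newView = mergeWithout(oldView, "A"); // docids renumbered: B=0, C=1

        System.out.println(oldView.get(hit)); // B: resolved against the view that produced it
        System.out.println(newView.get(hit)); // C: same docid, different document
    }
}
```

The second lookup returns a valid document, just not the one that matched,
which is the kind of random-looking bug the quote describes.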
~ David Smiley
On 1/27/13 9:25 AM, "Nagendra Nagarajayya" <nn...@transaxtions.com>
wrote:
>Hi:
>
>I am very excited to announce the availability of Apache Solr 3.6.2 with
>RankingAlgorithm30 1.4.3 with realtime-search support. realtime-search
>is a very fast NRT implementation that lets you not only look up a document
>by id but also search in realtime; see
>http://tgels.org/realtime-nrt.jsp. The update performance is about
>10,000 docs / sec. Query times are in milliseconds: you can query
>a 10m Wikipedia index (complete index) in <50 ms.
>
>This release also includes an experimental TimedSerialMergeScheduler
><http://rankingalgorithm.1050964.n5.nabble.com/TimedSerialMergerScheduler-java-allows-merges-to-be-deferred-to-a-known-time-like-11pm-or-1am-tp5706350.html> that
>allows you to postpone your merges to an off-hours time like 11pm or 1am,
>increasing performance.
>
>RankingAlgorithm30 1.4.3 supports the entire Lucene Query Syntax, ±
>and/or boolean queries.
>
>You can get more information about realtime-search performance from here:
>http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x
>
>You can download Solr 3.6.2 with RankingAlgorithm30 1.4.3 from here:
>http://solr-ra.tgels.org
>
>Please download and give the new version a try.
>
>Note:
>1. Apache Solr 3.6.2 with RankingAlgorithm30 1.4.3 is an external project.
>2. realtime-search has been contributed back to Apache Solr, see
>https://issues.apache.org/jira/browse/SOLR-3816
>
>
>Regards,
>
>Nagendra Nagarajayya
>http://solr-ra.tgels.org
>http://elasticsearch-ra.tgels.org
>http://rankingalgorithm.tgels.org
>