Posted to general@lucene.apache.org by Nagendra Nagarajayya <nn...@transaxtions.com> on 2013/01/27 15:25:04 UTC

[Announce] Apache Solr 3.6.2 with RankingAlgorithm 1.4.3 available for download now -- includes experimental TimedSerialMergeScheduler

Hi:

I am very excited to announce the availability of Apache Solr 3.6.2 with
RankingAlgorithm30 1.4.3 and realtime-search support. realtime-search
is a very fast NRT implementation that lets you not only look up a
document by id but also search in real time; see
http://tgels.org/realtime-nrt.jsp. Update performance is about
10,000 docs/sec. Query latency is in milliseconds: a 10m Wikipedia
index (complete index) can be queried in <50 ms.

This release also includes an experimental TimedSerialMergeScheduler
<http://rankingalgorithm.1050964.n5.nabble.com/TimedSerialMergerScheduler-java-allows-merges-to-be-deferred-to-a-known-time-like-11pm-or-1am-tp5706350.html> that
lets you postpone your merges to off-hours times such as 11pm or 1am
to improve performance.
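
As a rough illustration of the idea of deferring merges to an off-hours window, here is a minimal Java sketch of the time check such a scheduler would need. It is not the TimedSerialMergeScheduler source; the class name `MergeWindowSketch` and the method `isOpen` are hypothetical, and it uses `java.time`, which is newer than the Java versions Solr 3.6.2 targeted. The only subtle part is a window that wraps past midnight, such as 11pm to 1am.

```java
import java.time.LocalTime;

/** Hypothetical helper, not the actual TimedSerialMergeScheduler code. */
final class MergeWindowSketch {
    private final LocalTime start;
    private final LocalTime end;

    MergeWindowSketch(LocalTime start, LocalTime end) {
        this.start = start;
        this.end = end;
    }

    /** True if 'now' falls in [start, end), including windows that cross midnight. */
    boolean isOpen(LocalTime now) {
        if (start.equals(end)) {
            return true; // assumption: equal bounds mean the window is always open
        }
        if (start.isBefore(end)) {
            // Same-day window, e.g. 01:00 to 03:00.
            return !now.isBefore(start) && now.isBefore(end);
        }
        // Window wraps past midnight, e.g. 23:00 to 01:00.
        return !now.isBefore(start) || now.isBefore(end);
    }

    public static void main(String[] args) {
        MergeWindowSketch window =
            new MergeWindowSketch(LocalTime.of(23, 0), LocalTime.of(1, 0));
        System.out.println(window.isOpen(LocalTime.of(23, 30))); // inside the window
        System.out.println(window.isOpen(LocalTime.of(14, 0)));  // outside the window
    }
}
```

A scheduler built on this check would presumably queue merge requests while the window is closed and run them once it opens; how the real implementation does that is not described in this message.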

RankingAlgorithm30 1.4.3 supports the entire Lucene Query Syntax, +/-
and/or boolean queries.

You can get more information about realtime-search performance from here:
http://solr-ra.tgels.org/wiki/en/Near_Real_Time_Search_ver3.x

You can download Solr 3.6.2 with RankingAlgorithm30 1.4.3 from here:
http://solr-ra.tgels.org

Please download and give the new version a try.

Note:
1. Apache Solr 3.6.2 with RankingAlgorithm30 1.4.3 is an external project.
2. realtime-search has been contributed back to Apache Solr, see 
https://issues.apache.org/jira/browse/SOLR-3816


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://elasticsearch-ra.tgels.org
http://rankingalgorithm.tgels.org


Re: [Announce] Apache Solr 3.6.2 with RankingAlgorithm 1.4.3 available for download now -- includes experimental TimedSerialMergeScheduler

Posted by Nagendra Nagarajayya <nn...@transaxtions.com>.
Hi David:

Please see the new patch that I made available for JIRA:
https://issues.apache.org/jira/browse/SOLR-3816

This removes the code that Yonik had highlighted and also introduces
granularity-based realtime-search with two levels: request granularity
and intra-request granularity. Request granularity means that each
request may return new results. The underlying code ensures that all
the components of a request (search, highlighting, faceting, etc.) see
the same view of the index. Intra-request granularity means that each
component may see the changes happening to the index, so each may
return different results ...
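
To make the two granularities concrete, here is a toy Java sketch. It is not the patch attached to SOLR-3816 and uses no Solr classes; `GranularitySketch` and its methods are hypothetical names, and the "index" is just an immutable list published through an `AtomicReference`. Under request granularity, one view is captured up front and shared by every component; under intra-request granularity, each component re-reads the latest view.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

/** Toy model of an index view; all names are hypothetical, not Solr classes. */
final class GranularitySketch {
    // Latest published view of the "index": an immutable list of documents.
    private final AtomicReference<List<String>> current =
        new AtomicReference<>(List.of());

    /** Publish a new immutable view containing one more document. */
    void add(String doc) {
        current.updateAndGet(view -> {
            List<String> next = new ArrayList<>(view);
            next.add(doc);
            return List.copyOf(next);
        });
    }

    /** Request granularity: capture one view, then every component uses it. */
    int[] requestGranularity(Runnable updateBetweenComponents) {
        List<String> view = current.get();
        int searchCount = view.size();     // "search" component
        updateBetweenComponents.run();     // an update lands mid-request
        int facetCount = view.size();      // "faceting" sees the same snapshot
        return new int[] {searchCount, facetCount};
    }

    /** Intra-request granularity: each component re-reads the latest view. */
    int[] intraRequestGranularity(Runnable updateBetweenComponents) {
        int searchCount = current.get().size();
        updateBetweenComponents.run();
        int facetCount = current.get().size(); // may already include the update
        return new int[] {searchCount, facetCount};
    }

    public static void main(String[] args) {
        GranularitySketch index = new GranularitySketch();
        index.add("doc1");
        int[] perRequest = index.requestGranularity(() -> index.add("doc2"));
        System.out.println(perRequest[0] + " " + perRequest[1]);
        int[] perComponent = index.intraRequestGranularity(() -> index.add("doc3"));
        System.out.println(perComponent[0] + " " + perComponent[1]);
    }
}
```

In this toy model the two counts agree under request granularity and can differ under intra-request granularity, which is the trade-off described above.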

Request granularity has higher performance than intra-request
granularity. The SolrIndexSearcher object is not closed as before. The
commit or autocommit time can be set to a very high value, and the
transaction log can be disabled (for use without SolrCloud), for a
further improvement in performance.


Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://elasticsearch-ra.tgels.org
http://rankingalgorithm.tgels.org


On 1/30/2013 7:42 AM, Nagendra Nagarajayya wrote:
> Hi David:
>
> There are no NRT tricks being used. It uses the NRT-capable Reader
> made available by the IndexWriter.
> (the source is attached to the JIRA)
>
> I would suggest that you download and give this a try. You can 
> download from here:
> http://solr-ra.tgels.org ( you can download the 4.0 or the 3.6.2 
> version )
>
> I have a user who is using this in realtime, having indexed close to 2
> billion docs with no issues.
> If you do find any problems, please let me know or add onto the 
> existing JIRA so that I can fix it.
>
> Regards,
>
> Nagendra Nagarajayya
> http://solr-ra.tgels.org
> http://elasticsearch-ra.tgels.org
> http://rankingalgorithm.tgels.org


Re: [Announce] Apache Solr 3.6.2 with RankingAlgorithm 1.4.3 available for download now -- includes experimental TimedSerialMergeScheduler

Posted by Nagendra Nagarajayya <nn...@transaxtions.com>.
Hi David:

There are no NRT tricks being used. It uses the NRT-capable Reader made
available by the IndexWriter.
(the source is attached to the JIRA)

I would suggest that you download and give this a try. You can download 
from here:
http://solr-ra.tgels.org ( you can download the 4.0 or the 3.6.2 version )

I have a user who is using this in realtime, having indexed close to 2
billion docs with no issues.
If you do find any problems, please let me know or add onto the existing 
JIRA so that I can fix it.

Regards,

Nagendra Nagarajayya
http://solr-ra.tgels.org
http://elasticsearch-ra.tgels.org
http://rankingalgorithm.tgels.org


On 1/29/2013 8:23 AM, Smiley, David W. wrote:
> Hi.
>
> Speaking for myself anyway, I am leery of using it without it having
> extensive concurrent tests to validate that the NRT tricks you're doing
> don't produce incorrect results.  It would no doubt be very difficult to
> develop this test.  And this would be the kind of test that runs for
> a while, and you would stop it after running it overnight or whatever
> duration makes one feel comfortable.
>
> ~ David


Re: [Announce] Apache Solr 3.6.2 with RankingAlgorithm 1.4.3 available for download now -- includes experimental TimedSerialMergeScheduler

Posted by "Smiley, David W." <ds...@mitre.org>.
Hi.

Speaking for myself anyway, I am leery of using it without it having
extensive concurrent tests to validate that the NRT tricks you're doing
don't produce incorrect results.  It would no doubt be very difficult to
develop this test.  And this would be the kind of test that runs for
a while, and you would stop it after running it overnight or whatever
duration makes one feel comfortable.

~ David

On 1/29/13 8:34 AM, "Nagendra Nagarajayya" <nn...@transaxtions.com>
wrote:

>Hi David:
>
>Did you have a chance to see my comments in the JIRA?
>
>Regards,
>-NN


Re: [Announce] Apache Solr 3.6.2 with RankingAlgorithm 1.4.3 available for download now -- includes experimental TimedSerialMergeScheduler

Posted by Nagendra Nagarajayya <nn...@transaxtions.com>.
Hi David:

Did you have a chance to see my comments in the JIRA?

Regards,
-NN

On 1/28/2013 11:58 AM, Smiley, David W. wrote:
> Nagendra,
>
> I'm surprised to see you're still promoting your realtime-search based
> system given the critical problem that Yonik found:
> https://issues.apache.org/jira/browse/SOLR-3816?focusedCommentId=13494815&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13494815
>
> That is a serious fundamental flaw, I'm sorry to say.  To quote Yonik:
> "You'll get incorrect documents back, incorrect facets back, pretty much
> any number of random looking bugs because internal docids will be changing
> underneath you."   This won't necessarily happen all the time depending on
> the timing of the search with respect to concurrent changes, but it can
> happen.
>
> ~ David Smiley


Re: [Announce] Apache Solr 3.6.2 with RankingAlgorithm 1.4.3 available for download now -- includes experimental TimedSerialMergeScheduler

Posted by "Smiley, David W." <ds...@mitre.org>.
Nagendra,

I'm surprised to see you're still promoting your realtime-search based
system given the critical problem that Yonik found:
https://issues.apache.org/jira/browse/SOLR-3816?focusedCommentId=13494815&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13494815

That is a serious fundamental flaw, I'm sorry to say.  To quote Yonik:
"You'll get incorrect documents back, incorrect facets back, pretty much
any number of random looking bugs because internal docids will be changing
underneath you."   This won't necessarily happen all the time depending on
the timing of the search with respect to concurrent changes, but it can
happen.
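
The failure mode Yonik describes can be illustrated with a toy Java model. This is only a sketch under simplifying assumptions, not Lucene code: the "segment" is a plain list, a document's internal id is its position, a deleted document is a `null` entry, and the hypothetical `merge` method compacts the list the way a merge purges deletions. A docid collected against the old view then resolves to a different document in the new view.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of positional docids; hypothetical names, not Lucene classes. */
final class DocIdShiftSketch {
    /** Toy merge: drop deleted entries (null) so later documents move down. */
    static List<String> merge(List<String> segment) {
        List<String> merged = new ArrayList<>();
        for (String doc : segment) {
            if (doc != null) {
                merged.add(doc);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Position 1 holds a deleted document.
        List<String> before = Arrays.asList("a", null, "c", "d");
        int hit = 2; // a search against the old view matched docid 2
        System.out.println(before.get(hit));

        // A merge runs before the hit is resolved to a stored document.
        List<String> after = merge(before);
        System.out.println(after.get(hit)); // same docid, different document
    }
}
```

Holding one consistent view of the index for the whole request avoids this, because the docids collected by the search are resolved against the same view that produced them.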

~ David Smiley
