Posted to dev@lucenenet.apache.org by Robert Boulanger <ro...@boulanger.at> on 2006/12/03 10:58:10 UTC

Re: Remote searching with Lucene - forward progress

Hi Jeff,

concerning the message thread below, which I began in August this year, I 
wonder if there is any progress on your side so far.
Maybe I missed something on the mailing list (which I expect), since I was 
busy with other stuff, but the last note from you concerning remote 
search that I can find here was from September 13th.
So, since I'm on this topic again, I just want to know whether you 
released anything in the past months that I'm just not seeing, or if you 
are still on the issue you described in your last note.
Thanks for replying.

best regards

--Robert



Jeff Rodenburg schrieb:
> An update on the Remote Searching project I'm bringing forward.  I've
> completed the base code for hand-off to the community.  I'm presently
> working through a remoting/serialization issue that's popped up recently.
> This appears to be something new in the Lucene 2.0 release.  I'm working
> through that issue now, but I have no expectation of when that will be
> resolved.
>
> Rather than release a non-working system, I'm going to resolve this
> problem first.  Once things are working appropriately, I'll send out a
> release message.
>
> Thanks and if you have remoting experience and suggestions, feel free to
> ping me.  :-)
>
> cheers,
> jeff r.
>
>
> On 9/7/06, Jeff Rodenburg <je...@gmail.com> wrote:
>>
>> All -
>>
>> Another update on the remote searching application code that's been
>> mentioned in this thread.  I'm near completion of the entire collection of
>> files that are needed for this project -- libraries, applications, unit
>> tests, and documentation.  There's quite a bit to this, and thanks for
>> everybody's patience as I assemble the code into something that's less than
>> confusing.  There are several working pieces, so I'm packaging it for
>> consumption.
>>
>> I expect to have this available sometime in the next few days, barring
>> things like my life and regular job from getting in the way.  Again, I'll
>> share an announcement to the list when I've made the files available.
>>
>> Thanks,
>> jeff r.
>>
>>
>>
>> On 8/26/06, Jeff Rodenburg <je...@gmail.com> wrote:
>> >
>> > As promised, an update to the list.
>> >
>> > I have code ready for delivery, if I can get svn access to the contrib
>> > section.  A request has been made for this but it's going nowhere, so I'm
>> > going to find another place to host the files.
>> >
>> > There's quite a bit of documentation behind this so I'm working
>> > diligently to explain how this works.  If anyone has a place to hold the
>> > code until the uber-powers at apache decide to grant me access, we would
>> > greatly appreciate the assistance.
>> >
>> > cheers,
>> > jeff r.
>> >
>> >
>> >
>> > On 8/23/06, Jeff Rodenburg < jeff.rodenburg@gmail.com> wrote:
>> > >
>> > > Just a follow-up to everyone on this topic.  I received a lot of
>> > > offlist mail about this, so this message has a rather wide distribution.
>> > >
>> > > I'm in the process of modifying the code for our distributed search
>> > > components so that they're generic enough for general usage and public
>> > > consumption.  This is taking a little of my time, but nonetheless I expect
>> > > to complete it soon.
>> > >
>> > > As for distributing the code, it will be located in the contrib
>> > > portion of the Lucene.Net repository at apache.org.  There is some
>> > > logistic work involved, but ideally this is moving forward.
>> > >
>> > > As soon as I have more information to relay, I'll pass it along to the
>> > > list.
>> > >
>> > > cheers,
>> > > jeff r.
>> > >
>> > >
>> > >
>> > >
>> > > On 8/21/06, Jeff Rodenburg < jeff.rodenburg@gmail.com> wrote:
>> > > >
>> > > > Hello all -
>> > > >
>> > > > I've been watching this thread to follow the direction and thought I
>> > > > might be able to offer some assistance.  I run a search system that
>> > > > involves 4 separate search servers -- 3 serving search objects via
>> > > > RemoteSearchable, and a 4th that serves in an index updating role.
>> > > >
>> > > > The codebase for Lucene.Net provides all the library routines one
>> > > > needs to provide distributed search capabilities, but does not provide
>> > > > facilities for distributed search operation -- nor should it.  The ideas
>> > > > presented here are certainly possible; I've implemented a working
>> > > > operation without requiring the changes described here.  I'm confident
>> > > > in our implementation; for the calendar year, our uptime/availability
>> > > > of search services is 99.99%.  Our only outage was related to network
>> > > > hardware, otherwise we're sitting solid at 100%.
>> > > >
>> > > > I've been authorized to provide our operational code for distributed
>> > > > search under Lucene.Net to the community at large.  Some of the code
>> > > > is customized to our operation, but for the most part it's rather
>> > > > generic.  We started the project under Lucene v1.4.3, but the
>> > > > operational aspect still applies under v1.9.
>> > > >
>> > > > The system consists of a LuceneServer, which provides searchability
>> > > > against indexes as defined in XML configuration files.  In addition, an
>> > > > IndexUpdateServer provides master index updating, master/slave index
>> > > > replication and automated index maintenance.  Integration with our web
>> > > > site ensures the index stays available, updated and current.  There's a
>> > > > great deal of applied knowledge and learned behavior of many of the
>> > > > underlying sub-system components that distributed search under
>> > > > Lucene.Net makes use of -- .Net remoting, garbage collection, etc.
>> > > >
>> > > > If anyone has interest, please reply.  Contributing this code
>> > > > requires a little cleanup of our customization work, so my response may
>> > > > not be immediate but I would make efforts to release the code in short
>> > > > order.
>> > > >
>> > > > thanks,
>> > > > jeff r.
>> > > >
>> > > >
>> > > >
>> > > >
>> > > > On 8/19/06, Robert Boulanger < robert@boulanger.at> wrote:
>> > > > >
>> > > > > Hi Elena, hi everyone,
>> > > > >
>> > > > > > Dear All,
>> > > > > >
>> > > > > > The application I am working on is intended to make use of the
>> > > > > > distributed search capabilities of the Lucene library. While trying
>> > > > > > to work with Lucene's RemoteSearchable class, I faced some problems
>> > > > > > caused by the current Lucene implementation. In the following I'll
>> > > > > > try to describe them, as well as the possible solutions I
>> > > > > > identified. The most important question for me is whether these
>> > > > > > changes have a chance to be integrated in the coming Lucene
>> > > > > > versions, such that remote searches would really become feasible.
>> > > > > > I would appreciate any feedback.
>> > > > >
>> > > > > Same problem for me, and I found some more issues, which I explain
>> > > > > below:
>> > > > >
>> > > > > >
>> > > > > > The first problem concerns the construction of the RemoteSearchable
>> > > > > > object. The .Net framework allows for both server and client
>> > > > > > activation models of remote objects. Currently, the
>> > > > > > RemoteSearchable class possesses only one constructor, which
>> > > > > > requires knowledge of a local Searchable object:
>> > > > > >
>> > > > > > public RemoteSearchable(Lucene.Net.Search.Searchable local)
>> > > > > >
>> > > > > I just added a new constructor to RemoteSearchable:
>> > > > >
>> > > > > public RemoteSearchable(): base()
>> > > > > {
>> > > > >     this.local = this.local;
>> > > > > }
>> > > > >
>> > > > > Not the finest method, but for me it works so far.
>> > > > >
>> > > > > > Since this "local" object is located on the server, knowledge of
>> > > > > > the server's index paths is needed for its creation. However,
>> > > > > > there are at least some scenarios where only the server, but not
>> > > > > > the client, knows where the indexes are stored on the server side.
>> > > > > > I think this problem could be solved by extending the
>> > > > > > RemoteSearchable class with a standard constructor that reads the
>> > > > > > names of the indexes to be published out of a configuration file
>> > > > > > on the server side.
>> > > > > >
>> > > > > My "server" now implements a class which inherits directly from
>> > > > > RemoteSearchable.
>> > > > > In its parameterless constructor I read the server-side config file,
>> > > > > which contains the index location, create a new IndexReader and
>> > > > > pass it as argument to MyBase.New().
>> > > > > See sample below.
>> > > > >
>> > > > > > 2. Bug in Term construction
>> > > > > [snip]
>> > > > >
>> > > > > This whole chapter was very useful, and I can confirm that everything
>> > > > > works fine from there on.
>> > > > >
>> > > > > But there is still a bug in FieldDocSortedHitQueue, line 130 and
>> > > > > below:
>> > > > > I figured out that the conversions do not work when the system is
>> > > > > running in a non-English globalization context.
>> > > > > The string in docAFields[i], which might be for example 1.345678, is
>> > > > > converted to 1345678.0, since the decimal sign seems to be
>> > > > > misinterpreted on German systems.
>> > > > > So the conversion results in an overflow.
>> > > > >
>> > > > > So I changed it as follows:
>> > > > >
>> > > > > case SortField.SCORE:
>> > > > >     float r1 = (float)Convert.ToSingle(docA.fields[i],
>> > > > >         System.Globalization.NumberFormatInfo.InvariantInfo);
>> > > > >     // compare against docB here; reading docA twice would always tie
>> > > > >     float r2 = (float)Convert.ToSingle(docB.fields[i],
>> > > > >         System.Globalization.NumberFormatInfo.InvariantInfo);
>> > > > >     if (r1 > r2)
>> > > > >         c = -1;
>> > > > >     if (r1 < r2)
>> > > > >         c = 1;
>> > > > >     break;
>> > > > >
>> > > > > Same in lines 172 and 174:
>> > > > >
>> > > > > float f1 = (float)Convert.ToSingle(docA.fields[i],
>> > > > >     System.Globalization.NumberFormatInfo.InvariantInfo);
>> > > > > //UPGRADE_TODO: The equivalent in .NET for method 'java.lang.Float.floatValue' may return a different value. "ms-help://MS.VSCC.v80/dv_commoner/local/redirect.htm?index='!DefaultContextWindowIndex'&keyword='jlca1043'"
>> > > > > float f2 = (float)Convert.ToSingle(docB.fields[i],
>> > > > >     System.Globalization.NumberFormatInfo.InvariantInfo);
>> > > > >
>> > > > >
>> > > > >
>> > > > > A tiny client/server solution now looks like this (here in VB.NET):
>> > > > >
>> > > > > SERVER:
>> > > > > Imports System.Runtime.Remoting
>> > > > > Imports System.Runtime.Remoting.Channels
>> > > > > Imports System.Runtime.Remoting.Channels.Http
>> > > > > Imports Lucene.Net.Search
>> > > > >
>> > > > > ' Remotable searcher; the parameterless constructor lets the
>> > > > > ' server pick the index location itself.
>> > > > > Public Class RemoteQuery
>> > > > >     Inherits RemoteSearchable
>> > > > >     Public Sub New()
>> > > > >         MyBase.New(New IndexSearcher("C:\lucene\index"))
>> > > > >     End Sub
>> > > > >     Public Sub New(ByVal local As Searchable)
>> > > > >         MyBase.New(local)
>> > > > >     End Sub
>> > > > > End Class
>> > > > >
>> > > > > Module Module1
>> > > > >     Public Sub Main(ByVal args As System.String())
>> > > > >         Dim chnl As New HttpChannel(8888)
>> > > > >         ChannelServices.RegisterChannel(chnl, False)
>> > > > >         Dim indexName As System.String = Nothing
>> > > > >         RemotingConfiguration.RegisterWellKnownServiceType(GetType(RemoteQuery), "Searchable", WellKnownObjectMode.Singleton)
>> > > > >         System.Console.ReadLine()
>> > > > >     End Sub
>> > > > > End Module
>> > > > >
>> > > > > CLIENT:
>> > > > > Sub Main()
>> > > > >     Dim searchables As Lucene.Net.Search.Searchable() = New Lucene.Net.Search.Searchable() {LookupRemote()}
>> > > > >     Dim searcher As Searcher = New MultiSearcher(searchables)
>> > > > >     Dim sort As New Lucene.Net.Search.Sort
>> > > > >     sort.SetSort(Lucene.Net.Search.SortField.FIELD_SCORE)
>> > > > >     Dim query As Query = QueryParser.Parse("Harry", "body", New StandardAnalyzer())
>> > > > >     Dim result As Hits = searcher.Search(query, sort)
>> > > > > End Sub
>> > > > > Private Function LookupRemote() As Lucene.Net.Search.Searchable
>> > > > >     Return CType(Activator.GetObject(GetType(Lucene.Net.Search.Searchable), "http://192.168.8.7:8888/Searchable"), Lucene.Net.Search.Searchable)
>> > > > > End Function
>> > > > >
>> > > > > Hope this helps you and anybody else who has problems with
>> > > > > remote search so far.
>> > > > >
>> > > > > BTW: this all refers to version 1.9rc1.
>> > > > >
>> > > > > --Robert Boulanger
>> > > > >
>> > > >
>> > > >
>> > >
>> >
>>
>



Re: Remote searching with Lucene - forward progress

Posted by Robert Boulanger <ro...@boulanger.at>.
Hi George,

I'm answering inline:

>
> Sorry, but I don't understand what's going on here.  What modifications
> are you referring to that "Elena" and you made?  Was there some private
> email exchange?
>
>   
You can see Elena's suggestions in

lucene-net-dev Digest of: get.163

I answered 8 days after that, pointing out a FieldDocSortedHitQueue issue in non-English environments and a fix for it; that reply was also the start of this thread.
I also added some lines of VB.NET code to test the suggestions there.
To make it easy for you to check this again, I attached both messages here, since the countless quotes and indentations make the bottom of this message nearly unreadable.
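
The culture problem behind the FieldDocSortedHitQueue fix can be reproduced outside Lucene.Net. What follows is only a minimal sketch, not the actual Lucene.Net code: the class name and the hard-coded de-DE culture are assumptions made for illustration. It parses the same score string once with a German culture and once with the invariant number format, as the fix does:

```csharp
using System;
using System.Globalization;

// Illustrative only: the class name and the de-DE culture are assumptions,
// not part of Lucene.Net.
public static class InvariantParseSketch
{
    public static void Main()
    {
        // A score serialized with '.' as the decimal sign.
        string score = "1.345678";

        // In de-DE, '.' is the thousands separator, so the value is
        // not read as a number close to 1.3.
        CultureInfo german = CultureInfo.GetCultureInfo("de-DE");
        float misread = Convert.ToSingle(score, german);

        // The invariant number format always treats '.' as the decimal sign.
        float parsed = Convert.ToSingle(score, NumberFormatInfo.InvariantInfo);

        // Print both values with the invariant culture so they are comparable.
        Console.WriteLine(misread.ToString(CultureInfo.InvariantCulture));
        Console.WriteLine(parsed.ToString(CultureInfo.InvariantCulture));
    }
}
```

Passing an explicit format provider keeps the result independent of the thread culture on the server, which is why the fix uses NumberFormatInfo.InvariantInfo rather than relying on the default.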



> In any case, one other option for providing remote searching with
> Lucene.Net is to port the existing solution in 1.4 to 2.0 (or maybe even
> 1.9.1).  If you or someone else has the cycles and wants to take on this
> task, let us know and go for it.
>
> Sorting works with MultiSearcher.  Make sure you are using the latest
> release of 1.9.1 or 2.0 ("final" in both cases.)
>   
I have the latest 1.9.1 here, but it does not work for me. I will check 
this again and provide a sample showing the issue in case I can't 
find any mistake I have made here.
> I can't tell you much about Lucene.Net and WAN since I have not used it (I
> don't have a need for it, yet.)  Since you say you have written a solution,
> and it sounds like a good one, can you contribute it to ASF / Lucene.Net?
> If you can do so, make sure you have the appropriate ASF copyright message
> on each file, a README.TXT file, a sample / demo and if possible an NUnit
> test for the code.
>   
As mentioned in my previous message, I can do this. Currently this code is 
in an active production test, covering 8 huge file servers, about 100 
indices and about 2 TB of indexed data, which are located in Europe, Asia 
and the US and connected via VPN tunnels.
I expect that I will have to optimize and fix some things during the test 
phase, which is scheduled to run until the end of February.
After the tests are finished and the framework can be considered 
stable, I will find the time to provide a reusable solution with some 
samples.

Regards

Robert

> Regards,
>
> -- George Aroush
>
>
> -----Original Message-----
> From: Robert Boulanger [mailto:robert@boulanger.at] 
> Sent: Tuesday, January 02, 2007 6:19 PM
> To: lucene-net-dev@incubator.apache.org
> Subject: Re: Remote searching with Lucene - forward progress
>
> Hi Jeff,
>
> thanks for the update.
> Here the status from my side so far:
>
> Until I sent my last message, I worked successfully with the modifications
> Elena and I described before. I did nothing else since then; I waited and
> hoped for progress from other sides, but wondered why the suggested
> fixes never went into the releases of 1.9.
> Anyhow, another issue I found is that sorting does not seem to work
> correctly when using the remote searching features (and maybe when using
> MultiSearcher in general). It looks like each index is sorted, but not the
> hits collection of the MultiSearcher itself.
> But the major issue I found was that remote searches over a WAN, meaning
> the Internet or a VPN for example, take about 100 times as long as the same
> query within a LAN (7 seconds instead of 0.07 seconds). So I think the
> Lucene remote query relies on heavy bidirectional network traffic: not
> a lot of data, but a lot of single calls, which makes it slow in
> a WAN environment.
>
> Therefore I wrote my own client/server wrapper for this, which does things
> in a single call to each remote index, and which also works again
> with Lucene 1.3 if necessary.
> It can also cascade: each query server can be configured to forward the
> query to other servers, and they again, and so on. Endless loops (server A
> calls B, which calls A again) are prevented, and the API accepts a parameter
> that defines how deep (in the hierarchy of configured servers) the search
> should be forwarded. The end result again has correct
> sorting. I also don't use any MultiSearchers here, just normal IndexReaders.
>
> The whole architecture has nothing to do with Lucene itself, except the fact
> that Lucene is used for searching, but if anybody has interest in this, let
> me know; I can build a template or example of how to do this and post it
> somewhere.
>
> Cheers
>
> Robert
>
>
> Jeff Rodenburg schrieb:
>   
>> Hi Robert, et al. -
>>
>> No, I've not missed updating the list.  I've been a bit busy with 
>> other things but have been working to resolve some serialization 
>> issues that are down in the core of .Net Remoting.  The Lucene 2.0 
>> codebase has been problematic inside of the remoting architecture.  
>> Rather than continue to update the list with notifications about a 
>> lack of progress, I've opted to attempt to address those issues and 
>> make an announcement when I'd reached success.
>>
>> So, no news for now.
>>
>> thanks,
>> jeff
>>
>> On 12/3/06, Robert Boulanger <ro...@boulanger.at> wrote:
>>     
>>> Hi Jeff,
>>>
>>> concerning the message thread below which I began in August this 
>>> year, I wonder if there is any progress on your side so far.
>>> Maybe I missed something in the mailinglist (what I expect), since I 
>>> was busy with other stuff,  but the last note from you concerning 
>>> remote search I find here was from september 13th.
>>> So, since I'm on this topic again, I just want to know, whether you 
>>> released anything in the past months what I'm just not seeing or if 
>>> you are still on the issue you are describing in your last note.
>>> thanks for replying
>>>
>>> best regards
>>>
>>> --Robert
>>>
>>>
>>>
>>> Jeff Rodenburg schrieb:
>>>       
>>>> An update on the Remote Searching project I'm bringing forward.  
>>>> I've completed the base code for hand-off to the community.  I'm 
>>>> presently working through a remoting/serialization issue that's 
>>>> popped up
>>>>         
>>> recently.
>>>       
>>>> This appears to be something new in the Lucene 2.0 release.  I'm
>>>>         
>>> working
>>>       
>>>> through that issue now, but I haven no expectation of when that's 
>>>> resolved.
>>>>
>>>> Rather than release a non-working system, I'm going to resolve this 
>>>> problem first.  Once things are working appropriately, I'll send 
>>>> out a release message.
>>>>
>>>> Thanks and if you have remoting experience and suggestions, feel
>>>>         
>>> free to
>>>       
>>>> ping me.  :-)
>>>>
>>>> cheers,
>>>> jeff r.
>>>>
>>>>
>>>> On 9/7/06, Jeff Rodenburg <je...@gmail.com> wrote:
>>>>         
>>>>> All -
>>>>>
>>>>> Another update on the remote searching application code that's 
>>>>> been mentioned in this thread.  I'm near completion of the entire 
>>>>> collection of files that are needed for this project -- libraries, 
>>>>> applications,
>>>>>           
>>> unit
>>>       
>>>>> tests, and documentation.  There's quite a bit to this, and thanks
>>>>>           
>>> for
>>>       
>>>>> everybody's patience as I assemble the code into something that's 
>>>>> less than confusing.  There are several working pieces, so I'm 
>>>>> packaging it for consumption.
>>>>>
>>>>> I expect to have this available sometime in the next few days,
>>>>>           
>>> barring
>>>       
>>>>> things like my life and regular job from getting in the way.  
>>>>> Again, I'll share an announcement to the list when I've made the 
>>>>> files available.
>>>>>
>>>>> Thanks,
>>>>> jeff r.
>>>>>
>>>>>
>>>>>
>>>>> On 8/26/06, Jeff Rodenburg <je...@gmail.com> wrote:
>>>>>           
>>>>>> As promised, an update to the list.
>>>>>>
>>>>>> I have code ready for delivery, if I can get svn access to the
>>>>>>             
>>> contrib
>>>       
>>>>>> section.  A request has been made for this but it's going 
>>>>>> nowhere,
>>>>>>             
>>>>> so I'm
>>>>>           
>>>>>> going to find another place to host the files.
>>>>>>
>>>>>> There's quite a bit of documentation behind this so I'm working 
>>>>>> diligently to explain how this works.  If anyone has a place to
>>>>>>             
>>>>> hold the
>>>>>           
>>>>>> code until the uber-powers at apache decide to grant me access, 
>>>>>> we
>>>>>>             
>>>>> would
>>>>>           
>>>>>> greatly appreciate the assistance.
>>>>>>
>>>>>> cheers,
>>>>>> jeff r.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 8/23/06, Jeff Rodenburg < jeff.rodenburg@gmail.com> wrote:
>>>>>>             
>>>>>>> Just a follow-up to everyone on this topic.  I received a lot 
>>>>>>> of offlist mail about this, so this message has a rather wide
>>>>>>>               
>>>>> distribution.
>>>>>           
>>>>>>> I'm in process of modifying the code for our distributed 
>>>>>>> search components so that they're generic enough for general 
>>>>>>> usage and
>>>>>>>               
>>>>> public
>>>>>           
>>>>>>> consumption.  This is taking a little of my time, but 
>>>>>>> nonetheless
>>>>>>>               
>>>>> I expect
>>>>>           
>>>>>>> to complete it soon.
>>>>>>>
>>>>>>> As for distributing the code, it will be located in the 
>>>>>>> contrib portion of the Lucene.Net repository at apache.org .  
>>>>>>> There is
>>>>>>>               
>>> some
>>>       
>>>>>>> logistic work involved, but ideally this is moving forward.
>>>>>>>
>>>>>>> As soon as I have more information to relay, I'll pass it 
>>>>>>> along
>>>>>>>               
>>>>> to the
>>>>>           
>>>>>>> list.
>>>>>>>
>>>>>>> cheers,
>>>>>>> jeff r.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 8/21/06, Jeff Rodenburg < jeff.rodenburg@gmail.com> wrote:
>>>>>>>               
>>>>>>>> Hello all -
>>>>>>>>
>>>>>>>> I've been watching this thread to follow the direction and
>>>>>>>>                 
>>>>> thought I
>>>>>           
>>>>>>>> might be able to offer some assistance.  I run a search 
>>>>>>>> system
>>>>>>>>                 
>>>>> that involves
>>>>>           
>>>>>>>> 4 separate search servers -- 3 serving search objects via
>>>>>>>>                 
>>>>> RemoteSearchable,
>>>>>           
>>>>>>>> and a 4th that serves in an index updating role.
>>>>>>>>
>>>>>>>> The codebase for Lucene.Net provides all the library
>>>>>>>>                 
>>> routines one
>>>       
>>>>>>>> needs to provide distributed search capabilities, but does 
>>>>>>>> not
>>>>>>>>                 
>>>>> provide
>>>>>           
>>>>>>>> facilities for distributed search operation -- nor should it.
>>>>>>>>                 
>>>>> The ideas
>>>>>           
>>>>>>>> presented here are certainly possible; I've implemented a
>>>>>>>>                 
>>>>> working operation
>>>>>           
>>>>>>>> without requiring the changes described here.  I'm confident 
>>>>>>>> in
>>>>>>>>                 
>>>>> our
>>>>>           
>>>>>>>> implementation; for the calendar year, our 
>>>>>>>> uptime/availability
>>>>>>>>                 
>>>>> of search
>>>>>           
>>>>>>>> services is 99.99%.  Our only outage was related to network 
>>>>>>>> hardware, otherwise we're sitting solid at 100%.
>>>>>>>>
>>>>>>>> I've been authorized to provide our operational code for
>>>>>>>>                 
>>>>> distributed
>>>>>           
>>>>>>>> search under Lucene.Net to the community at large.  Some of 
>>>>>>>> the
>>>>>>>>                 
>>>>> code
>>>>>           
>>>>>>>> is customized to our operation, but for the most part it's
>>>>>>>>                 
>>>>> rather generic.
>>>>>           
>>>>>>>> We started the project under Lucene v1.4.3, but the 
>>>>>>>> operational aspect still applies under v1.9.
>>>>>>>>
>>>>>>>> The system consists of a LuceneServer, which provides
>>>>>>>>                 
>>>>> searchability
>>>>>           
>>>>>>>> against indexes as defined in XML configuration files.  In
>>>>>>>>                 
>>>>> addition, an
>>>>>           
>>>>>>>> IndexUpdateServer provides master index updating, 
>>>>>>>> master/slave
>>>>>>>>                 
>>>>> index
>>>>>           
>>>>>>>> replication and automated index maintenance.  Integration 
>>>>>>>> with
>>>>>>>>                 
>>>>> our web site
>>>>>           
>>>>>>>> ensures the index stays available, updated and current.
>>>>>>>>                 
>>>>> There's a great
>>>>>           
>>>>>>>> deal of applied knowledge and learned behavior of many of 
>>>>>>>> the
>>>>>>>>                 
>>>>> underlying
>>>>>           
>>>>>>>> sub-system components that distributed search under 
>>>>>>>> Lucene.Net
>>>>>>>>                 
>>>>> makes
>>>>>           
>>>>>>>> use of -- .Net remoting, garbage collection, etc.
>>>>>>>>
>>>>>>>> If anyone has interest, please reply.  Contributing this 
>>>>>>>> code requires a little cleanup of our customization work, so 
>>>>>>>> my
>>>>>>>>                 
>>>>> response may not
>>>>>           
>>>>>>>> be immediate but I would make efforts to release the code in
>>>>>>>>                 
>>>>> short order.
>>>>>           
>>>>>>>> thanks,
>>>>>>>> jeff r.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 8/19/06, Robert Boulanger < robert@boulanger.at> wrote:
>>>>>>>>                 
>>>>>>>>> Hi Elena, hi Rest,
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>>>> Dear All,
>>>>>>>>>>
>>>>>>>>>> The application I am working on is intended to make use 
>>>>>>>>>> of
>>>>>>>>>>                     
>>> the
>>>       
>>>>>>>>>> distributed search capabilities of the Lucene library. 
>>>>>>>>>>                     
>>> While
>>>       
>>>>>>>>> trying to
>>>>>>>>>                   
>>>>>>>>>> work with the Lucene's RemoteSearchable class, I faced 
>>>>>>>>>> some
>>>>>>>>>>                     
>>>>>>>>> problems
>>>>>>>>>                   
>>>>>>>>>> cased by the current Lucene implementation. In following
>>>>>>>>>>                     
>>> I'll
>>>       
>>>>>>>>> try to
>>>>>>>>>                   
>>>>>>>>>> describe them, as well as the possible ways of their
>>>>>>>>>>                     
>>>>> solution, I
>>>>>           
>>>>>>>>>> identified. The most important question for me is, if 
>>>>>>>>>> these
>>>>>>>>>>                     
>>>>>>>>> changes
>>>>>>>>>                   
>>>>>>>>>> have a chance to be integrated in the coming Lucene
>>>>>>>>>>                     
>>> versions,
>>>       
>>>>>>>>> such
>>>>>>>>>                   
>>>>>>>>>> that remote searches would really become feasible. I 
>>>>>>>>>> would
>>>>>>>>>>                     
>>>>>>>>> appreciate
>>>>>>>>>                   
>>>>>>>>>> any feedback.
>>>>>>>>>>                     
>>>>>>>>> Same problem for me and I found some more issues which I
>>>>>>>>>                   
>>> explain
>>>       
>>>>>>>>> below:
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>>>> The first problem concerns the construction of the
>>>>>>>>>>                     
>>>>>>>>> RemoteSearchable
>>>>>>>>>                   
>>>>>>>>>> object. .Net framework allows for both, server and 
>>>>>>>>>> client
>>>>>>>>>>                     
>>>>>>>>> activation
>>>>>>>>>                   
>>>>>>>>>> models of the remote objects. Currently, 
>>>>>>>>>> RemoteSearchable
>>>>>>>>>>                     
>>>>> class
>>>>>           
>>>>>>>>>> possesses only one constructor that requires knowledge 
>>>>>>>>>> of a
>>>>>>>>>>                     
>>>>>>>>> local
>>>>>>>>>                   
>>>>>>>>>> Searchable object:
>>>>>>>>>>
>>>>>>>>>> public RemoteSearchable(Lucene.Net.Search.Searchable 
>>>>>>>>>> local)
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>> I just added a new constructor to RemoteSearchable public 
>>>>>>>>> RemoteSearchable(): base() { this.local = this.local; }
>>>>>>>>>
>>>>>>>>> not the fine method but for me it works so far.
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>>>> Since this "local" object is located on the server,
>>>>>>>>>>                     
>>>>> knowledge of
>>>>>           
>>>>>>>>> the
>>>>>>>>>                   
>>>>>>>>>> server's index paths is needed for its creation. 
>>>>>>>>>> However,
>>>>>>>>>>                     
>>>>> there
>>>>>           
>>>>>>>>> are at
>>>>>>>>>                   
>>>>>>>>>> least some scenarios where only the server, but not the
>>>>>>>>>>                     
>>>>> client,
>>>>>           
>>>>>>>>> knows
>>>>>>>>>                   
>>>>>>>>>> where the indexes are stored on the server side. I think
>>>>>>>>>>                     
>>> this
>>>       
>>>>>>>>> problem
>>>>>>>>>                   
>>>>>>>>>> could be solved by extending RemoteSearchable class with 
>>>>>>>>>> a
>>>>>>>>>>                     
>>>>>>>>> standard
>>>>>>>>>                   
>>>>>>>>>> constructor that reads the names of the indexes to be
>>>>>>>>>>                     
>>>>> published
>>>>>           
>>>>>>>>> out of
>>>>>>>>>                   
>>>>>>>>>> a configuration file on the server side.
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>> My "Server" now implements a Class which inherits directly
>>>>>>>>>                   
>>> from
>>>       
>>>>>>>>> Remote
>>>>>>>>> Searchable.
>>>>>>>>> in the parameterless constructor there I read the server
>>>>>>>>>                   
>>> sided
>>>       
>>>>>>>>> configfile which contains the index location , create a 
>>>>>>>>> new IndexReader and pass it as Argument to MyBase.New() 
>>>>>>>>> See sample below.
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>>>> 2. Bug in Term construction
>>>>>>>>>>                     
>>>>>>>>> [snip]
>>>>>>>>>
>>>>>>>>> This whole chapter was very useful and I can commit
>>>>>>>>>                   
>>> everything
>>>       
>>>>>>>>> works
>>>>>>>>> fine from there on.
>>>>>>>>>
>>>>>>>>> But there is still a bug in FieldDocSortedHitQueue line
>>>>>>>>>                   
>>> 130 and
>>>       
>>>>>>>>> below:
>>>>>>>>> I figured out that the castings are not working when the
>>>>>>>>>                   
>>>>> system is
>>>>>           
>>>>>>>>> running in a non english globalization context.
>>>>>>>>> The String in docAFields[i] which might be for example
>>>>>>>>>                   
>>>>> 1.345678 is
>>>>>           
>>>>>>>>> casted to 1345678.0 since the decimal sign is
>>>>>>>>>                   
>>> misinterpreted in
>>>       
>>>>>>>>> German
>>>>>>>>> systems as it seems.
>>>>>>>>> So the casting results in an overflow.
>>>>>>>>>
>>>>>>>>> So I changed it as follows:
>>>>>>>>>
>>>>>>>>> case SortField.SCORE:
>>>>>>>>> float r1 = (float)Convert.ToSingle(docA.fields[i],
>>>>>>>>> System.Globalization.NumberFormatInfo.InvariantInfo ); 
>>>>>>>>> float r2 = (float)Convert.ToSingle(docA.fields[i],
>>>>>>>>> System.Globalization.NumberFormatInfo.InvariantInfo);
>>>>>>>>> if (r1 > r2)
>>>>>>>>> c = - 1;
>>>>>>>>> if (r1 < r2)
>>>>>>>>> c = 1;
>>>>>>>>> break;
>>>>>>>>>
>>>>>>>>> Same in line 172 and 174:
>>>>>>>>>
>>>>>>>>> float f1 = (float)Convert.ToSingle(docA.fields[i],
>>>>>>>>> System.Globalization.NumberFormatInfo.InvariantInfo);
>>>>>>>>> //UPGRADE_TODO: The equivalent in .NET for method
>>>>>>>>> //'java.lang.Float.floatValue' may return a different value.
>>>>>>>>> //"ms-help://MS.VSCC.v80/dv_commoner/local/redirect.htm?index='!DefaultContextWindowIndex'&keyword='jlca1043'"
>>>>>>>>> float f2 = (float)Convert.ToSingle(docB.fields[i],
>>>>>>>>> System.Globalization.NumberFormatInfo.InvariantInfo );
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> A tiny client/server solution now looks like this (here in VB.NET):
>>>>>>>>>
>>>>>>>>> SERVER:
>>>>>>>>> Public Class RemoteQuery
>>>>>>>>>     Inherits RemoteSearchable
>>>>>>>>>     Public Sub New()
>>>>>>>>>         MyBase.New(New IndexSearcher("C:\lucene\index"))
>>>>>>>>>     End Sub
>>>>>>>>>     Public Sub New(ByVal local As Searchable)
>>>>>>>>>         MyBase.New(local)
>>>>>>>>>     End Sub
>>>>>>>>> End Class
>>>>>>>>>
>>>>>>>>> Module Module1
>>>>>>>>>     Public Sub Main(ByVal args As System.String())
>>>>>>>>>         Dim chnl As New HttpChannel(8888)
>>>>>>>>>         ChannelServices.RegisterChannel(chnl, False)
>>>>>>>>>         Dim indexName As System.String = Nothing
>>>>>>>>>         RemotingConfiguration.RegisterWellKnownServiceType( _
>>>>>>>>>             GetType(RemoteQuery), "Searchable", _
>>>>>>>>>             WellKnownObjectMode.Singleton)
>>>>>>>>>         System.Console.ReadLine()
>>>>>>>>>     End Sub
>>>>>>>>> End Module
>>>>>>>>>
>>>>>>>>> CLIENT:
>>>>>>>>> Sub Main()
>>>>>>>>>     Dim searchables As Lucene.Net.Search.Searchable() = _
>>>>>>>>>         New Lucene.Net.Search.Searchable() {LookupRemote()}
>>>>>>>>>     Dim searcher As Searcher = New MultiSearcher(searchables)
>>>>>>>>>     Dim sort As New Lucene.Net.Search.Sort
>>>>>>>>>     sort.SetSort(Lucene.Net.Search.SortField.FIELD_SCORE)
>>>>>>>>>     Dim query As Query = QueryParser.Parse("Harry", "body", _
>>>>>>>>>         New StandardAnalyzer())
>>>>>>>>>     Dim result As Hits = searcher.Search(query, sort)
>>>>>>>>> End Sub
>>>>>>>>>
>>>>>>>>> Private Function LookupRemote() As Lucene.Net.Search.Searchable
>>>>>>>>>     Return CType(Activator.GetObject( _
>>>>>>>>>         GetType(Lucene.Net.Search.Searchable), _
>>>>>>>>>         "http://192.168.8.7:8888/Searchable"), _
>>>>>>>>>         Lucene.Net.Search.Searchable)
>>>>>>>>> End Function
>>>>>>>>>
>>>>>>>>> Hope this helps you and anybody else who has had problems with
>>>>>>>>> remote search so far.
>>>>>>>>>
>>>>>>>>> BTW: this all refer
>>>>>>>>>                   
>
>
>   
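
Note on the number-parsing bug quoted above: the effect can be reproduced
outside Lucene.Net. The following is only a minimal sketch, not the
FieldDocSortedHitQueue code itself. The class ScoreParsing and the
GermanStyle format object are hypothetical names made up for this
illustration; GermanStyle is a hand-built stand-in for a German culture
(',' as decimal separator, '.' as group separator), so the sketch does not
depend on which cultures are installed.

```csharp
using System;
using System.Globalization;

public static class ScoreParsing
{
    // Hypothetical stand-in for German-style number formatting:
    // ',' is the decimal separator and '.' is the group separator.
    public static readonly NumberFormatInfo GermanStyle = new NumberFormatInfo
    {
        NumberDecimalSeparator = ",",
        NumberGroupSeparator = "."
    };

    // Culture-sensitive parse: the '.' is read as a group separator and dropped.
    public static float ParseGermanStyle(string text)
    {
        return Convert.ToSingle(text, GermanStyle);
    }

    // Culture-independent parse, in the style of the fix quoted above.
    public static float ParseInvariant(string text)
    {
        return Convert.ToSingle(text, NumberFormatInfo.InvariantInfo);
    }

    public static void Main()
    {
        string score = "1.345678";
        // First line: the misread value; second line: the intended value.
        Console.WriteLine(ParseGermanStyle(score).ToString(CultureInfo.InvariantCulture));
        Console.WriteLine(ParseInvariant(score).ToString(CultureInfo.InvariantCulture));
    }
}
```

With the German-style format the string "1.345678" should come back as
1345678, which matches the symptom described in the quoted message; passing
an invariant format provider keeps the value at roughly 1.345678 no matter
which culture the process runs under.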


RE: Remote searching with Lucene - forward progress

Posted by George Aroush <ge...@aroush.net>.
Hi Robert,

Sorry, but I don't understand what's going on here.  Which modifications
that "Elena" and you made are you referring to?  Was there some private
email exchange?

In any case, one other option for providing remote searching with
Lucene.Net is to port the existing solution in 1.4 to 2.0 (or maybe even
1.9.1).  If you or someone else has the cycles and wants to take on this
task, let us know and go for it.

Sorting works with MultiSearcher.  Make sure you are using the latest
release of 1.9.1 or 2.0 ("final" in both cases).

I can't tell you much about Lucene.Net and WAN since I have not used it (I
don't have a need for it, yet).  Since you say you have written a solution,
and it sounds like a good one, can you contribute it to ASF / Lucene.Net?
If you can do so, make sure you have the appropriate ASF copyright message
on each file, a README.TXT file, a sample / demo and, if possible, an NUnit
test for the code.

Regards,

-- George Aroush


-----Original Message-----
From: Robert Boulanger [mailto:robert@boulanger.at] 
Sent: Tuesday, January 02, 2007 6:19 PM
To: lucene-net-dev@incubator.apache.org
Subject: Re: Remote searching with Lucene - forward progress

Hi Jeff,

thanks for the update.
Here the status from my side so far:

Until I posted my last message, I had been working successfully with the
modifications Elena and I described before. I did nothing else since then,
because I waited and hoped for progress from other sides, but I wondered why
the suggested fixes never went into the 1.9 releases.
Anyhow, another issue I found is that sorting seems not to work correctly
when using the remote searching features (and maybe when using MultiSearcher
in general). It looks like each index is sorted, but not the hits collection
of the MultiSearcher itself.
But the major issue I found was that remote searches over a WAN, meaning the
Internet or a VPN for example, take about 100 times as long as the same query
within a LAN (7 seconds instead of 0.07 seconds). So I think the Lucene
remote query relies on heavy bidirectional network traffic: not a lot of
data being transported, but a lot of single calls, which makes it slow in a
WAN environment.

Therefore I wrote my own client/server wrapper for this, which does things in
a single call to each remote index, and which is now also possible again
with Lucene 1.3 if necessary.
I'm also able to do this in a cascading way, meaning each query server can be
configured to forward the query to other servers, and they again to others,
and so on. It is ensured that endless loops are not possible (server A calls
B, which calls A again), and the API allows passing a parameter that defines
how deep (in the hierarchy of configured servers) the search should be
forwarded. The end result again has correct sorting. I also don't use any
MultiSearchers here, just normal IndexReaders.

The whole architecture has nothing to do with Lucene itself, except for the
fact that Lucene is used for searching, but if anybody is interested in
this, let me know; I can build a template or example of how to do this and
post it somewhere.

Cheers

Robert


Jeff Rodenburg schrieb:
> Hi Robert, et. al -
>
> No, I've not missed updating the list.  I've been a bit busy with 
> other things but have been working to resolve some serialization 
> issues that are down in the core of .Net Remoting.  The Lucene 2.0 
> codebase has been problematic inside of the remoting architecture.  
> Rather than continue to update the list with notifications about a 
> lack of progress, I've opted to attempt to address those issues and 
> make an announcement when I'd reached success.
>
> So, no news for now.
>
> thanks,
> jeff
>
> On 12/3/06, Robert Boulanger <ro...@boulanger.at> wrote:
>>
>> Hi Jeff,
>>
>> concerning the message thread below which I began in August this 
>> year, I wonder if there is any progress on your side so far.
>> Maybe I missed something in the mailinglist (what I expect), since I 
>> was busy with other stuff,  but the last note from you concerning 
>> remote search I find here was from september 13th.
>> So, since I'm on this topic again, I just want to know, whether you 
>> released anything in the past months what I'm just not seeing or if 
>> you are still on the issue you are describing in your last note.
>> thanks for replying
>>
>> best regards
>>
>> --Robert
>>
>>
>>
>> Jeff Rodenburg schrieb:
>> > An update on the Remote Searching project I'm bringing forward.  
>> > I've completed the base code for hand-off to the community.  I'm 
>> > presently working through a remoting/serialization issue that's 
>> > popped up
>> recently.
>> > This appears to be something new in the Lucene 2.0 release.  I'm
>> working
>> > through that issue now, but I have no expectation of when that's
>> > resolved.
>> >
>> > Rather than release a non-working system, I'm going to resolve this 
>> > problem first.  Once things are working appropriately, I'll send 
>> > out a release message.
>> >
>> > Thanks and if you have remoting experience and suggestions, feel
>> free to
>> > ping me.  :-)
>> >
>> > cheers,
>> > jeff r.
>> >
>> >
>> > On 9/7/06, Jeff Rodenburg <je...@gmail.com> wrote:
>> >>
>> >> All -
>> >>
>> >> Another update on the remote searching application code that's 
>> >> been mentioned in this thread.  I'm near completion of the entire 
>> >> collection of files that are needed for this project -- libraries, 
>> >> applications,
>> unit
>> >> tests, and documentation.  There's quite a bit to this, and thanks
>> for
>> >> everybody's patience as I assemble the code into something that's 
>> >> less than confusing.  There are several working pieces, so I'm 
>> >> packaging it for consumption.
>> >>
>> >> I expect to have this available sometime in the next few days,
>> barring
>> >> things like my life and regular job from getting in the way.  
>> >> Again, I'll share an announcement to the list when I've made the 
>> >> files available.
>> >>
>> >> Thanks,
>> >> jeff r.
>> >>
>> >>
>> >>
>> >> On 8/26/06, Jeff Rodenburg <je...@gmail.com> wrote:
>> >> >
>> >> > As promised, an update to the list.
>> >> >
>> >> > I have code ready for delivery, if I can get svn access to the
>> contrib
>> >> > section.  A request has been made for this but it's going 
>> >> > nowhere,
>> >> so I'm
>> >> > going to find another place to host the files.
>> >> >
>> >> > There's quite a bit of documentation behind this so I'm working 
>> >> > diligently to explain how this works.  If anyone has a place to
>> >> hold the
>> >> > code until the uber-powers at apache decide to grant me access, 
>> >> > we
>> >> would
>> >> > greatly appreciate the assistance.
>> >> >
>> >> > cheers,
>> >> > jeff r.
>> >> >
>> >> >
>> >> >
>> >> > On 8/23/06, Jeff Rodenburg < jeff.rodenburg@gmail.com> wrote:
>> >> > >
>> >> > > Just a follow-up to everyone on this topic.  I received a lot 
>> >> > > of offlist mail about this, so this message has a rather wide
>> >> distribution.
>> >> > >
>> >> > > I'm in process of modifying the code for our distributed 
>> >> > > search components so that they're generic enough for general 
>> >> > > usage and
>> >> public
>> >> > > consumption.  This is taking a little of my time, but 
>> >> > > nonetheless
>> >> I expect
>> >> > > to complete it soon.
>> >> > >
>> >> > > As for distributing the code, it will be located in the 
>> >> > > contrib portion of the Lucene.Net repository at apache.org .  
>> >> > > There is
>> some
>> >> > > logistic work involved, but ideally this is moving forward.
>> >> > >
>> >> > > As soon as I have more information to relay, I'll pass it 
>> >> > > along
>> >> to the
>> >> > > list.
>> >> > >
>> >> > > cheers,
>> >> > > jeff r.
>> >> > >
>> >> > >
>> >> > >
>> >> > >
>> >> > > On 8/21/06, Jeff Rodenburg < jeff.rodenburg@gmail.com> wrote:
>> >> > > >
>> >> > > > Hello all -
>> >> > > >
>> >> > > > I've been watching this thread to follow the direction and
>> >> thought I
>> >> > > > might be able to offer some assistance.  I run a search 
>> >> > > > system
>> >> that involves
>> >> > > > 4 separate search servers -- 3 serving search objects via
>> >> RemoteSearchable,
>> >> > > > and a 4th that serves in an index updating role.
>> >> > > >
>> >> > > > The codebase for Lucene.Net provides all the library
>> routines one
>> >> > > > needs to provide distributed search capabilities, but does 
>> >> > > > not
>> >> provide
>> >> > > > facilities for distributed search operation -- nor should it.
>> >> The ideas
>> >> > > > presented here are certainly possible; I've implemented a
>> >> working operation
>> >> > > > without requiring the changes described here.  I'm confident 
>> >> > > > in
>> >> our
>> >> > > > implementation; for the calendar year, our 
>> >> > > > uptime/availability
>> >> of search
>> >> > > > services is 99.99%.  Our only outage was related to network 
>> >> > > > hardware, otherwise we're sitting solid at 100%.
>> >> > > >
>> >> > > > I've been authorized to provide our operational code for
>> >> distributed
>> >> > > > search under Lucene.Net to the community at large.  Some of 
>> >> > > > the
>> >> code
>> >> > > > is customized to our operation, but for the most part it's
>> >> rather generic.
>> >> > > > We started the project under Lucene v1.4.3, but the 
>> >> > > > operational aspect still applies under v1.9.
>> >> > > >
>> >> > > > The system consists of a LuceneServer, which provides
>> >> searchability
>> >> > > > against indexes as defined in XML configuration files.  In
>> >> addition, an
>> >> > > > IndexUpdateServer provides master index updating, 
>> >> > > > master/slave
>> >> index
>> >> > > > replication and automated index maintenance.  Integration 
>> >> > > > with
>> >> our web site
>> >> > > > ensures the index stays available, updated and current.
>> >> There's a great
>> >> > > > deal of applied knowledge and learned behavior of many of 
>> >> > > > the
>> >> underlying
>> >> > > > sub-system components that distributed search under 
>> >> > > > Lucene.Net
>> >> makes
>> >> > > > use of -- .Net remoting, garbage collection, etc.
>> >> > > >
>> >> > > > If anyone has interest, please reply.  Contributing this 
>> >> > > > code requires a little cleanup of our customization work, so 
>> >> > > > my
>> >> response may not
>> >> > > > be immediate but I would make efforts to release the code in
>> >> short order.
>> >> > > >
>> >> > > > thanks,
>> >> > > > jeff r.
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > > On 8/19/06, Robert Boulanger < robert@boulanger.at> wrote:
>> >> > > > >
>> >> > > > > Hi Elena, hi Rest,
>> >> > > > >
>> >> > > > > > Dear All,
>> >> > > > > >
>> >> > > > > > The application I am working on is intended to make use 
>> >> > > > > > of
>> the
>> >> > > > > > distributed search capabilities of the Lucene library. 
>> While
>> >> > > > > trying to
>> >> > > > > > work with the Lucene's RemoteSearchable class, I faced 
>> >> > > > > > some
>> >> > > > > problems
>> >> > > > > > caused by the current Lucene implementation. In following
>> I'll
>> >> > > > > try to
>> >> > > > > > describe them, as well as the possible ways of their
>> >> solution, I
>> >> > > > > > identified. The most important question for me is, if 
>> >> > > > > > these
>> >> > > > > changes
>> >> > > > > > have a chance to be integrated in the coming Lucene
>> versions,
>> >> > > > > such
>> >> > > > > > that remote searches would really become feasible. I 
>> >> > > > > > would
>> >> > > > > appreciate
>> >> > > > > > any feedback.
>> >> > > > >
>> >> > > > > Same problem for me and I found some more issues which I
>> explain
>> >> > > > > below:
>> >> > > > >
>> >> > > > > >
>> >> > > > > > The first problem concerns the construction of the
>> >> > > > > RemoteSearchable
>> >> > > > > > object. .Net framework allows for both, server and 
>> >> > > > > > client
>> >> > > > > activation
>> >> > > > > > models of the remote objects. Currently, 
>> >> > > > > > RemoteSearchable
>> >> class
>> >> > > > > > possesses only one constructor that requires knowledge 
>> >> > > > > > of a
>> >> > > > > local
>> >> > > > > > Searchable object:
>> >> > > > > >
>> >> > > > > > public RemoteSearchable(Lucene.Net.Search.Searchable 
>> >> > > > > > local)
>> >> > > > > >
>> >> > > > > I just added a new constructor to RemoteSearchable
>> >> > > > > public RemoteSearchable(): base()
>> >> > > > > {
>> >> > > > > this.local = this.local;
>> >> > > > > }
>> >> > > > >
>> >> > > > > not the fine method but for me it works so far.
>> >> > > > >
>> >> > > > > > Since this "local" object is located on the server,
>> >> knowledge of
>> >> > > > > the
>> >> > > > > > server's index paths is needed for its creation. 
>> >> > > > > > However,
>> >> there
>> >> > > > > are at
>> >> > > > > > least some scenarios where only the server, but not the
>> >> client,
>> >> > > > > knows
>> >> > > > > > where the indexes are stored on the server side. I think
>> this
>> >> > > > > problem
>> >> > > > > > could be solved by extending RemoteSearchable class with 
>> >> > > > > > a
>> >> > > > > standard
>> >> > > > > > constructor that reads the names of the indexes to be
>> >> published
>> >> > > > > out of
>> >> > > > > > a configuration file on the server side.
>> >> > > > > >
>> >> > > > > My "Server" now implements a Class which inherits directly
>> from
>> >> > > > > Remote
>> >> > > > > Searchable.
>> >> > > > > in the parameterless constructor there I read the server
>> sided
>> >> > > > > configfile which contains the index location , create a 
>> >> > > > > new IndexReader and pass it as Argument to MyBase.New() 
>> >> > > > > See sample below.
>> >> > > > >
>> >> > > > > > 2. Bug in Term construction
>> >> > > > > [snip]
>> >> > > > >
>> >> > > > > This whole chapter was very useful, and I can confirm
>> >> > > > > everything works fine from there on.
>> >> > > > >
>> >> > > > > But there is still a bug in FieldDocSortedHitQueue, line 130
>> >> > > > > and below:
>> >> > > > > I figured out that the conversions do not work when the system
>> >> > > > > is running in a non-English globalization context.
>> >> > > > > The string in docAFields[i], which might be for example
>> >> > > > > 1.345678, is converted to 1345678.0, since the decimal sign
>> >> > > > > seems to be misinterpreted on German systems.
>> >> > > > > So the conversion results in an overflow.
>> >> > > > >
>> >> > > > > So I changed it as follows:
>> >> > > > >
>> >> > > > > case SortField.SCORE:
>> >> > > > > float r1 = (float)Convert.ToSingle(docA.fields[i],
>> >> > > > > System.Globalization.NumberFormatInfo.InvariantInfo ); 
>> >> > > > > float r2 = (float)Convert.ToSingle(docB.fields[i],
>> >> > > > > System.Globalization.NumberFormatInfo.InvariantInfo);
>> >> > > > > if (r1 > r2)
>> >> > > > > c = - 1;
>> >> > > > > if (r1 < r2)
>> >> > > > > c = 1;
>> >> > > > > break;
>> >> > > > >
>> >> > > > > Same in line 172 and 174:
>> >> > > > >
>> >> > > > > float f1 = (float)Convert.ToSingle(docA.fields[i],
>> >> > > > > System.Globalization.NumberFormatInfo.InvariantInfo);
>> >> > > > > //UPGRADE_TODO: The equivalent in .NET for method
>> >> > > > > //'java.lang.Float.floatValue' may return a different value.
>> >> > > > > //"ms-help://MS.VSCC.v80/dv_commoner/local/redirect.htm?index='!DefaultContextWindowIndex'&keyword='jlca1043'"
>> >> > > > > float f2 = (float)Convert.ToSingle(docB.fields[i],
>> >> > > > > System.Globalization.NumberFormatInfo.InvariantInfo );
>> >> > > > >
>> >> > > > >
>> >> > > > >
>> >> > > > > A tiny client/server solution now looks like this (here in
>> >> > > > > VB.NET):
>> >> > > > >
>> >> > > > > SERVER:
>> >> > > > > Public Class RemoteQuery
>> >> > > > >     Inherits RemoteSearchable
>> >> > > > >     Public Sub New()
>> >> > > > >         MyBase.New(New IndexSearcher("C:\lucene\index"))
>> >> > > > >     End Sub
>> >> > > > >     Public Sub New(ByVal local As Searchable)
>> >> > > > >         MyBase.New(local)
>> >> > > > >     End Sub
>> >> > > > > End Class
>> >> > > > >
>> >> > > > > Module Module1
>> >> > > > >     Public Sub Main(ByVal args As System.String())
>> >> > > > >         Dim chnl As New HttpChannel(8888)
>> >> > > > >         ChannelServices.RegisterChannel(chnl, False)
>> >> > > > >         Dim indexName As System.String = Nothing
>> >> > > > >         RemotingConfiguration.RegisterWellKnownServiceType( _
>> >> > > > >             GetType(RemoteQuery), "Searchable", _
>> >> > > > >             WellKnownObjectMode.Singleton)
>> >> > > > >         System.Console.ReadLine()
>> >> > > > >     End Sub
>> >> > > > > End Module
>> >> > > > >
>> >> > > > > CLIENT:
>> >> > > > > Sub Main()
>> >> > > > >     Dim searchables As Lucene.Net.Search.Searchable() = _
>> >> > > > >         New Lucene.Net.Search.Searchable() {LookupRemote()}
>> >> > > > >     Dim searcher As Searcher = New MultiSearcher(searchables)
>> >> > > > >     Dim sort As New Lucene.Net.Search.Sort
>> >> > > > >     sort.SetSort(Lucene.Net.Search.SortField.FIELD_SCORE)
>> >> > > > >     Dim query As Query = QueryParser.Parse("Harry", "body", _
>> >> > > > >         New StandardAnalyzer())
>> >> > > > >     Dim result As Hits = searcher.Search(query, sort)
>> >> > > > > End Sub
>> >> > > > >
>> >> > > > > Private Function LookupRemote() As Lucene.Net.Search.Searchable
>> >> > > > >     Return CType(Activator.GetObject( _
>> >> > > > >         GetType(Lucene.Net.Search.Searchable), _
>> >> > > > >         "http://192.168.8.7:8888/Searchable"), _
>> >> > > > >         Lucene.Net.Search.Searchable)
>> >> > > > > End Function
>> >> > > > >
>> >> > > > > Hope this helps you and anybody else who has had problems with
>> >> > > > > remote search so far.
>> >> > > > >
>> >> > > > > BTW: this all refer



Re: Remote searching with Lucene - forward progress

Posted by Robert Boulanger <ro...@boulanger.at>.
Hi Jeff,

thanks for the update.
Here the status from my side so far:

Until I posted my last message, I had been working successfully with the
modifications Elena and I described before. I did nothing else since
then, because I waited and hoped for progress from other sides, but I
wondered why the suggested fixes never went into the 1.9 releases.
Anyhow, another issue I found is that sorting seems not to work
correctly when using the remote searching features (and maybe when using
MultiSearcher in general). It looks like each index is sorted, but not
the hits collection of the MultiSearcher itself.
But the major issue I found was that remote searches over a WAN, meaning
the Internet or a VPN for example, take about 100 times as long as the
same query within a LAN (7 seconds instead of 0.07 seconds). So I think
the Lucene remote query relies on heavy bidirectional network traffic:
not a lot of data being transported, but a lot of single calls, which
makes it slow in a WAN environment.

Therefore I wrote my own client/server wrapper for this, which does
things in a single call to each remote index, and which is now also
possible again with Lucene 1.3 if necessary.
I'm also able to do this in a cascading way, meaning each query server
can be configured to forward the query to other servers, and they again
to others, and so on. It is ensured that endless loops are not possible
(server A calls B, which calls A again), and the API allows passing a
parameter that defines how deep (in the hierarchy of configured servers)
the search should be forwarded. The end result again has correct
sorting. I also don't use any MultiSearchers here, just normal
IndexReaders.

The whole architecture has nothing to do with Lucene itself, except for
the fact that Lucene is used for searching, but if anybody is interested
in this, let me know; I can build a template or example of how to do
this and post it somewhere.
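
To make the cascading idea described above a little more concrete, here is
a rough, Lucene-free sketch. It is not the wrapper from this message (that
code has not been posted); every name in it (Hit, QueryServer, AddPeer,
Search, Collect) is hypothetical. It assumes each server answers with its
complete hit list in one call, that a set of already-visited server names
travels with the request to stop A -> B -> A loops, and that a depth
parameter limits how far the query is forwarded.

```csharp
using System;
using System.Collections.Generic;

public class Hit
{
    public readonly string Doc;
    public readonly float Score;
    public Hit(string doc, float score) { Doc = doc; Score = score; }
}

public class QueryServer
{
    public readonly string Name;
    private readonly List<Hit> localHits;   // stand-in for a search on a local index
    private readonly List<QueryServer> peers = new List<QueryServer>();

    public QueryServer(string name, List<Hit> localHits)
    {
        Name = name;
        this.localHits = localHits;
    }

    public void AddPeer(QueryServer peer) { peers.Add(peer); }

    // Entry point: one call, returning the merged hits sorted by score.
    public List<Hit> Search(int maxDepth)
    {
        List<Hit> merged = Collect(maxDepth, new HashSet<string>());
        // Sort once over the merged set, highest score first.
        merged.Sort((a, b) => b.Score.CompareTo(a.Score));
        return merged;
    }

    private List<Hit> Collect(int remainingDepth, HashSet<string> visited)
    {
        List<Hit> result = new List<Hit>();
        if (!visited.Add(Name)) return result;   // already asked: breaks loops
        result.AddRange(localHits);
        if (remainingDepth > 0)
        {
            foreach (QueryServer peer in peers)
                result.AddRange(peer.Collect(remainingDepth - 1, visited));
        }
        return result;
    }
}

public static class CascadeDemo
{
    public static void Main()
    {
        var a = new QueryServer("A", new List<Hit> { new Hit("a-1", 0.40f) });
        var b = new QueryServer("B", new List<Hit> { new Hit("b-1", 0.90f) });
        var c = new QueryServer("C", new List<Hit> { new Hit("c-1", 0.65f) });
        a.AddPeer(b);
        b.AddPeer(a);   // deliberate A <-> B cycle
        b.AddPeer(c);

        foreach (Hit h in a.Search(2))
            Console.WriteLine(h.Doc);
    }
}
```

In this sketch a depth of 0 searches only the local server, a depth of 1
also asks its direct peers, and so on; the cycle between A and B does not
cause repeated work because B finds A already in the visited set. In a real
remote setup the visited names would have to be serialized with each
request, and sorting would use whatever sort criteria the query specifies
rather than the raw score alone.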

Cheers

Robert


Jeff Rodenburg schrieb:
> Hi Robert, et. al -
>
> No, I've not missed updating the list.  I've been a bit busy with other
> things but have been working to resolve some serialization issues that 
> are
> down in the core of .Net Remoting.  The Lucene 2.0 codebase has been
> problematic inside of the remoting architecture.  Rather than continue to
> update the list with notifications about a lack of progress, I've 
> opted to
> attempt to address those issues and make an announcement when I'd reached
> success.
>
> So, no news for now.
>
> thanks,
> jeff
>
> On 12/3/06, Robert Boulanger <ro...@boulanger.at> wrote:
>>
>> Hi Jeff,
>>
>> concerning the message thread below which I began in August this year, I
>> wonder if there is any progress on your side so far.
>> Maybe I missed something in the mailinglist (what I expect), since I was
>> busy with other stuff,  but the last note from you concerning remote
>> search I find here was from september 13th.
>> So, since I'm on this topic again, I just want to know, whether you
>> released anything in the past months what I'm just not seeing or if you
>> are still on the issue you are describing in your last note.
>> thanks for replying
>>
>> best regards
>>
>> --Robert
>>
>>
>>
>> Jeff Rodenburg schrieb:
>> > An update on the Remote Searching project I'm bringing forward.  I've
>> > completed the base code for hand-off to the community.  I'm presently
>> > working through a remoting/serialization issue that's popped up
>> recently.
>> > This appears to be something new in the Lucene 2.0 release.  I'm 
>> working
>> > through that issue now, but I have no expectation of when that's
>> > resolved.
>> >
>> > Rather than release a non-working system, I'm going to resolve this
>> > problem
>> > first.  Once things are working appropriately, I'll send out a release
>> > message.
>> >
>> > Thanks and if you have remoting experience and suggestions, feel 
>> free to
>> > ping me.  :-)
>> >
>> > cheers,
>> > jeff r.
>> >
>> >
>> > On 9/7/06, Jeff Rodenburg <je...@gmail.com> wrote:
>> >>
>> >> All -
>> >>
>> >> Another update on the remote searching application code that's been
>> >> mentioned in this thread.  I'm near completion of the entire
>> >> collection of
>> >> files that are needed for this project -- libraries, applications, 
>> unit
>> >> tests, and documentation.  There's quite a bit to this, and thanks 
>> for
>> >> everybody's patience as I assemble the code into something that's
>> >> less than
>> >> confusing.  There are several working pieces, so I'm packaging it for
>> >> consumption.
>> >>
>> >> I expect to have this available sometime in the next few days, 
>> barring
>> >> things like my life and regular job from getting in the way.  Again,
>> >> I'll
>> >> share an announcement to the list when I've made the files available.
>> >>
>> >> Thanks,
>> >> jeff r.
>> >>
>> >>
>> >>
>> >> On 8/26/06, Jeff Rodenburg <je...@gmail.com> wrote:
>> >> >
>> >> > As promised, an update to the list.
>> >> >
>> >> > I have code ready for delivery, if I can get svn access to the
>> contrib
>> >> > section.  A request has been made for this but it's going nowhere,
>> >> so I'm
>> >> > going to find another place to host the files.
>> >> >
>> >> > There's quite a bit of documentation behind this so I'm working
>> >> > diligently to explain how this works.  If anyone has a place to
>> >> hold the
>> >> > code until the uber-powers at apache decide to grant me access, we
>> >> would
>> >> > greatly appreciate the assistance.
>> >> >
>> >> > cheers,
>> >> > jeff r.
>> >> >
>> >> >
>> >> >
>> >> > On 8/23/06, Jeff Rodenburg < jeff.rodenburg@gmail.com> wrote:
>> >> > >
>> >> > > Just a follow-up to everyone on this topic.  I received a lot of
>> >> > > offlist mail about this, so this message has a rather wide
>> >> distribution.
>> >> > >
>> >> > > I'm in process of modifying the code for our distributed search
>> >> > > components so that they're generic enough for general usage and
>> >> public
>> >> > > consumption.  This is taking a little of my time, but nonetheless
>> >> I expect
>> >> > > to complete it soon.
>> >> > >
>> >> > > As for distributing the code, it will be located in the contrib
>> >> > > portion of the Lucene.Net repository at apache.org .  There is 
>> some
>> >> > > logistic work involved, but ideally this is moving forward.
>> >> > >
>> >> > > As soon as I have more information to relay, I'll pass it along
>> >> to the
>> >> > > list.
>> >> > >
>> >> > > cheers,
>> >> > > jeff r.
>> >> > >
>> >> > >
>> >> > >
>> >> > >
>> >> > > On 8/21/06, Jeff Rodenburg < jeff.rodenburg@gmail.com> wrote:
>> >> > > >
>> >> > > > Hello all -
>> >> > > >
>> >> > > > I've been watching this thread to follow the direction and
>> >> thought I
>> >> > > > might be able to offer some assistance.  I run a search system
>> >> that involves
>> >> > > > 4 separate search servers -- 3 serving search objects via
>> >> RemoteSearchable,
>> >> > > > and a 4th that serves in an index updating role.
>> >> > > >
>> >> > > > The codebase for Lucene.Net provides all the library 
>> routines one
>> >> > > > needs to provide distributed search capabilities, but does not
>> >> provide
>> >> > > > facilities for distributed search operation -- nor should it.
>> >> The ideas
>> >> > > > presented here are certainly possible; I've implemented a
>> >> working operation
>> >> > > > without requiring the changes described here.  I'm confident in
>> >> our
>> >> > > > implementation; for the calendar year, our uptime/availability
>> >> of search
>> >> > > > services is 99.99%.  Our only outage was related to network
>> >> > > > hardware, otherwise we're sitting solid at 100%.
>> >> > > >
>> >> > > > I've been authorized to provide our operational code for
>> >> distributed
>> >> > > > search under Lucene.Net to the community at large.  Some of the
>> >> code
>> >> > > > is customized to our operation, but for the most part it's
>> >> rather generic.
>> >> > > > We started the project under Lucene v1.4.3, but the operational
>> >> > > > aspect still applies under v1.9.
>> >> > > >
>> >> > > > The system consists of a LuceneServer, which provides
>> >> searchability
>> >> > > > against indexes as defined in XML configuration files.  In
>> >> addition, an
>> >> > > > IndexUpdateServer provides master index updating, master/slave
>> >> index
>> >> > > > replication and automated index maintenance.  Integration with
>> >> our web site
>> >> > > > ensures the index stays available, updated and current.
>> >> There's a great
>> >> > > > deal of applied knowledge and learned behavior of many of the
>> >> underlying
>> >> > > > sub-system components that distributed search under Lucene.Net
>> >> makes
>> >> > > > use of -- .Net remoting, garbage collection, etc.
>> >> > > >
>> >> > > > If anyone has interest, please reply.  Contributing this code
>> >> > > > requires a little cleanup of our customization work, so my
>> >> response may not
>> >> > > > be immediate but I would make efforts to release the code in
>> >> short order.
>> >> > > >
>> >> > > > thanks,
>> >> > > > jeff r.
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > > On 8/19/06, Robert Boulanger < robert@boulanger.at> wrote:
>> >> > > > >
>> >> > > > > Hi Elena, hi Rest,
>> >> > > > >
>> >> > > > > > Dear All,
>> >> > > > > >
>> >> > > > > > The application I am working on is intended to make use of
>> the
>> >> > > > > > distributed search capabilities of the Lucene library. 
>> While
>> >> > > > > trying to
>> >> > > > > > work with the Lucene's RemoteSearchable class, I faced some
>> >> > > > > problems
>> >> > > > > > caused by the current Lucene implementation. In following
>> I'll
>> >> > > > > try to
>> >> > > > > > describe them, as well as the possible ways of their
>> >> solution, I
>> >> > > > > > identified. The most important question for me is, if these
>> >> > > > > changes
>> >> > > > > > have a chance to be integrated in the coming Lucene 
>> versions,
>> >> > > > > such
>> >> > > > > > that remote searches would really become feasible. I would
>> >> > > > > appreciate
>> >> > > > > > any feedback.
>> >> > > > >
>> >> > > > > Same problem for me and I found some more issues which I
>> explain
>> >> > > > > below:
>> >> > > > >
>> >> > > > > >
>> >> > > > > > The first problem concerns the construction of the
>> >> > > > > RemoteSearchable
>> >> > > > > > object. .Net framework allows for both, server and client
>> >> > > > > activation
>> >> > > > > > models of the remote objects. Currently, RemoteSearchable
>> >> class
>> >> > > > > > possesses only one constructor that requires knowledge of a
>> >> > > > > local
>> >> > > > > > Searchable object:
>> >> > > > > >
>> >> > > > > > public RemoteSearchable(Lucene.Net.Search.Searchable local)
>> >> > > > > >
>> >> > > > > I just added a new constructor to RemoteSearchable
>> >> > > > > public RemoteSearchable(): base()
>> >> > > > > {
>> >> > > > > this.local = this.local;
>> >> > > > > }
>> >> > > > >
>> >> > > > > not the fine method but for me it works so far.
>> >> > > > >
>> >> > > > > > Since this "local" object is located on the server,
>> >> knowledge of
>> >> > > > > the
>> >> > > > > > server's index paths is needed for its creation. However,
>> >> there
>> >> > > > > are at
>> >> > > > > > least some scenarios where only the server, but not the
>> >> client,
>> >> > > > > knows
>> >> > > > > > where the indexes are stored on the server side. I think 
>> this
>> >> > > > > problem
>> >> > > > > > could be solved by extending RemoteSearchable class with a
>> >> > > > > standard
>> >> > > > > > constructor that reads the names of the indexes to be
>> >> published
>> >> > > > > out of
>> >> > > > > > a configuration file on the server side.
>> >> > > > > >
>> >> > > > > My "Server" now implements a class which inherits directly from
>> >> > > > > RemoteSearchable. In its parameterless constructor I read the
>> >> > > > > server-side config file, which contains the index location, create
>> >> > > > > a new IndexSearcher and pass it as an argument to MyBase.New().
>> >> > > > > See the sample below.
>> >> > > > >
>> >> > > > > > 2. Bug in Term construction
>> >> > > > > [snip]
>> >> > > > >
>> >> > > > > This whole chapter was very useful and I can confirm that
>> >> > > > > everything works fine from there on.
>> >> > > > >
>> >> > > > > But there is still a bug in FieldDocSortedHitQueue, line 130 and
>> >> > > > > below:
>> >> > > > > I figured out that the conversions do not work when the system is
>> >> > > > > running in a non-English globalization context.
>> >> > > > > The string in docA.fields[i], which might be for example 1.345678,
>> >> > > > > is converted to 1345678.0, since the decimal separator is
>> >> > > > > apparently misinterpreted on German systems.
>> >> > > > > So the conversion results in an overflow.
>> >> > > > >
>> >> > > > > So I changed it as follows:
>> >> > > > >
>> >> > > > > case SortField.SCORE:
>> >> > > > > float r1 = (float)Convert.ToSingle(docA.fields[i],
>> >> > > > > System.Globalization.NumberFormatInfo.InvariantInfo );
>> >> > > > > float r2 = (float)Convert.ToSingle(docB.fields[i],
>> >> > > > > System.Globalization.NumberFormatInfo.InvariantInfo);
>> >> > > > > if (r1 > r2)
>> >> > > > > c = - 1;
>> >> > > > > if (r1 < r2)
>> >> > > > > c = 1;
>> >> > > > > break;
>> >> > > > >
>> >> > > > > Same in line 172 and 174:
>> >> > > > >
>> >> > > > > float f1 = (float)Convert.ToSingle(docA.fields[i],
>> >> > > > > System.Globalization.NumberFormatInfo.InvariantInfo);
>> >> > > > > //UPGRADE_TODO: The equivalent in .NET for method
>> >> > > > > //'java.lang.Float.floatValue' may return a different value.
>> >> > > > > //"ms-help://MS.VSCC.v80/dv_commoner/local/redirect.htm?index='!DefaultContextWindowIndex'&keyword='jlca1043'"
>> >> > > > > float f2 = (float)Convert.ToSingle(docB.fields[i],
>> >> > > > > System.Globalization.NumberFormatInfo.InvariantInfo );
>> >> > > > >
>> >> > > > >
>> >> > > > >
>> >> > > > > A tiny client/server solution now looks like this (here in VB.NET):
>> >> > > > > SERVER:
>> >> > > > > Public Class RemoteQuery
>> >> > > > >     Inherits RemoteSearchable
>> >> > > > >     ' Parameterless constructor, needed for server-side activation
>> >> > > > >     Public Sub New()
>> >> > > > >         MyBase.New(New IndexSearcher("C:\lucene\index"))
>> >> > > > >     End Sub
>> >> > > > >     Public Sub New(ByVal local As Searchable)
>> >> > > > >         MyBase.New(local)
>> >> > > > >     End Sub
>> >> > > > > End Class
>> >> > > > >
>> >> > > > > Module Module1
>> >> > > > >     Public Sub Main(ByVal args As System.String())
>> >> > > > >         Dim chnl As New HttpChannel(8888)
>> >> > > > >         ChannelServices.RegisterChannel(chnl, False)
>> >> > > > >         Dim indexName As System.String = Nothing
>> >> > > > >         RemotingConfiguration.RegisterWellKnownServiceType( _
>> >> > > > >             GetType(RemoteQuery), "Searchable", WellKnownObjectMode.Singleton)
>> >> > > > >         ' Keep the server alive until Enter is pressed
>> >> > > > >         System.Console.ReadLine()
>> >> > > > >     End Sub
>> >> > > > > End Module
>> >> > > > > CLIENT:
>> >> > > > > Sub Main()
>> >> > > > >     Dim searchables As Lucene.Net.Search.Searchable() = _
>> >> > > > >         New Lucene.Net.Search.Searchable() {LookupRemote()}
>> >> > > > >     Dim searcher As Searcher = New MultiSearcher(searchables)
>> >> > > > >     Dim sort As New Lucene.Net.Search.Sort
>> >> > > > >     sort.SetSort(Lucene.Net.Search.SortField.FIELD_SCORE)
>> >> > > > >     Dim query As Query = QueryParser.Parse("Harry", "body", _
>> >> > > > >         New StandardAnalyzer())
>> >> > > > >     Dim result As Hits = searcher.Search(query, sort)
>> >> > > > > End Sub
>> >> > > > > Private Function LookupRemote() As Lucene.Net.Search.Searchable
>> >> > > > >     Return CType(Activator.GetObject( _
>> >> > > > >         GetType(Lucene.Net.Search.Searchable), _
>> >> > > > >         "http://192.168.8.7:8888/Searchable"), _
>> >> > > > >         Lucene.Net.Search.Searchable)
>> >> > > > > End Function
>> >> > > > >
>> >> > > > > Hope this helps you and anybody else who has had problems with
>> >> > > > > remote search so far.
>> >> > > > >
>> >> > > > > BTW: this all refer



Re: Remote searching with Lucene - forward progress

Posted by Jeff Rodenburg <je...@gmail.com>.
Hi Robert, et al. -

No, I've not missed updating the list.  I've been a bit busy with other
things but have been working to resolve some serialization issues that are
down in the core of .Net Remoting.  The Lucene 2.0 codebase has been
problematic inside of the remoting architecture.  Rather than continue to
update the list with notifications about a lack of progress, I've opted to
attempt to address those issues and make an announcement once I've
succeeded.

So, no news for now.

thanks,
jeff

On 12/3/06, Robert Boulanger <ro...@boulanger.at> wrote:
>
> Hi Jeff,
>
> Concerning the message thread below, which I began in August this year, I
> wonder if there is any progress on your side so far.
> Maybe I missed something on the mailing list (which I expect), since I was
> busy with other stuff, but the last note from you concerning remote
> search that I can find here was from September 13th.
> So, since I'm on this topic again, I just want to know whether you
> released anything in the past months that I'm just not seeing, or if you
> are still on the issue you described in your last note.
> Thanks for replying.
>
> best regards
>
> --Robert
>
>
>
> Jeff Rodenburg wrote:
> > An update on the Remote Searching project I'm bringing forward.  I've
> > completed the base code for hand-off to the community.  I'm presently
> > working through a remoting/serialization issue that's popped up recently.
> > This appears to be something new in the Lucene 2.0 release.  I'm working
> > through that issue now, but I have no expectation of when that will be
> > resolved.
> >
> > Rather than release a non-working system, I'm going to resolve this problem
> > first.  Once things are working appropriately, I'll send out a release
> > message.
> >
> > Thanks and if you have remoting experience and suggestions, feel free to
> > ping me.  :-)
> >
> > cheers,
> > jeff r.
> >
> >
> > On 9/7/06, Jeff Rodenburg <je...@gmail.com> wrote:
> >>
> >> All -
> >>
> >> Another update on the remote searching application code that's been
> >> mentioned in this thread.  I'm near completion of the entire collection of
> >> files that are needed for this project -- libraries, applications, unit
> >> tests, and documentation.  There's quite a bit to this, and thanks for
> >> everybody's patience as I assemble the code into something that's less than
> >> confusing.  There are several working pieces, so I'm packaging it for
> >> consumption.
> >>
> >> I expect to have this available sometime in the next few days, barring
> >> things like my life and regular job getting in the way.  Again, I'll
> >> share an announcement to the list when I've made the files available.
> >>
> >> Thanks,
> >> jeff r.
> >>
> >>
> >>
> >> On 8/26/06, Jeff Rodenburg <je...@gmail.com> wrote:
> >> >
> >> > As promised, an update to the list.
> >> >
> >> > I have code ready for delivery, if I can get svn access to the contrib
> >> > section.  A request has been made for this but it's going nowhere, so I'm
> >> > going to find another place to host the files.
> >> >
> >> > There's quite a bit of documentation behind this so I'm working
> >> > diligently to explain how this works.  If anyone has a place to hold the
> >> > code until the uber-powers at Apache decide to grant me access, we would
> >> > greatly appreciate the assistance.
> >> >
> >> > cheers,
> >> > jeff r.
> >> >
> >> >
> >> >
> >> > On 8/23/06, Jeff Rodenburg < jeff.rodenburg@gmail.com> wrote:
> >> > >
> >> > > Just a follow-up to everyone on this topic.  I received a lot of
> >> > > offlist mail about this, so this message has a rather wide distribution.
> >> > >
> >> > > I'm in the process of modifying the code for our distributed search
> >> > > components so that they're generic enough for general usage and public
> >> > > consumption.  This is taking a little of my time, but nonetheless I expect
> >> > > to complete it soon.
> >> > >
> >> > > As for distributing the code, it will be located in the contrib
> >> > > portion of the Lucene.Net repository at apache.org.  There is some
> >> > > logistical work involved, but ideally this is moving forward.
> >> > >
> >> > > As soon as I have more information to relay, I'll pass it along to the
> >> > > list.
> >> > >
> >> > > cheers,
> >> > > jeff r.
> >> > >
> >> > >
> >> > >
> >> > >
> >> > > On 8/21/06, Jeff Rodenburg < jeff.rodenburg@gmail.com> wrote:
> >> > > >
> >> > > > Hello all -
> >> > > >
> >> > > > I've been watching this thread to follow the direction and thought I
> >> > > > might be able to offer some assistance.  I run a search system that
> >> > > > involves 4 separate search servers -- 3 serving search objects via
> >> > > > RemoteSearchable, and a 4th that serves in an index updating role.
> >> > > >
> >> > > > The codebase for Lucene.Net provides all the library routines one
> >> > > > needs to provide distributed search capabilities, but does not provide
> >> > > > facilities for distributed search operation -- nor should it.  The
> >> > > > ideas presented here are certainly possible; I've implemented a working
> >> > > > operation without requiring the changes described here.  I'm confident
> >> > > > in our implementation; for the calendar year, our uptime/availability
> >> > > > of search services is 99.99%.  Our only outage was related to network
> >> > > > hardware, otherwise we're sitting solid at 100%.
> >> > > >
> >> > > > I've been authorized to provide our operational code for distributed
> >> > > > search under Lucene.Net to the community at large.  Some of the code
> >> > > > is customized to our operation, but for the most part it's rather
> >> > > > generic.  We started the project under Lucene v1.4.3, but the
> >> > > > operational aspect still applies under v1.9.
> >> > > >
> >> > > > The system consists of a LuceneServer, which provides searchability
> >> > > > against indexes as defined in XML configuration files.  In addition, an
> >> > > > IndexUpdateServer provides master index updating, master/slave index
> >> > > > replication and automated index maintenance.  Integration with our web
> >> > > > site ensures the index stays available, updated and current.  There's a
> >> > > > great deal of applied knowledge and learned behavior of many of the
> >> > > > underlying sub-system components that distributed search under
> >> > > > Lucene.Net makes use of -- .Net remoting, garbage collection, etc.
> >> > > >
> >> > > > If anyone has interest, please reply.  Contributing this code
> >> > > > requires a little cleanup of our customization work, so my response may
> >> > > > not be immediate but I would make efforts to release the code in short
> >> > > > order.
> >> > > >
> >> > > > thanks,
> >> > > > jeff r.
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > On 8/19/06, Robert Boulanger < robert@boulanger.at> wrote:
> >> > > > >
> >> > > > > Hi Elena, hi everyone else,
> >> > > > >
> >> > > > > > Dear All,
> >> > > > > >
> >> > > > > > The application I am working on is intended to make use of the
> >> > > > > > distributed search capabilities of the Lucene library. While
> >> > > > > > trying to work with Lucene's RemoteSearchable class, I faced some
> >> > > > > > problems caused by the current Lucene implementation. In the
> >> > > > > > following I'll try to describe them, as well as the possible
> >> > > > > > solutions I identified. The most important question for me is
> >> > > > > > whether these changes have a chance of being integrated into the
> >> > > > > > coming Lucene versions, such that remote searches would really
> >> > > > > > become feasible. I would appreciate any feedback.
> >> > > > >
> >> > > > > Same problem for me, and I found some more issues, which I explain
> >> > > > > below:
> >> > > > >
> >> > > > > >
> >> > > > > > The first problem concerns the construction of the
> >> > > > > > RemoteSearchable object. The .NET Framework allows for both server
> >> > > > > > and client activation models of remote objects. Currently, the
> >> > > > > > RemoteSearchable class possesses only one constructor, which
> >> > > > > > requires knowledge of a local Searchable object:
> >> > > > > >
> >> > > > > > public RemoteSearchable(Lucene.Net.Search.Searchable local)
> >> > > > > >
> >> > > > > I just added a new constructor to RemoteSearchable
> >> > > > > public RemoteSearchable(): base()
> >> > > > > {
> >> > > > > this.local = this.local;
> >> > > > > }
> >> > > > >
> >> > > > > Not the nicest approach, but it works for me so far.
> >> > > > >
> >> > > > > > Since this "local" object is located on the server, knowledge of
> >> > > > > > the server's index paths is needed for its creation. However, there
> >> > > > > > are at least some scenarios where only the server, but not the
> >> > > > > > client, knows where the indexes are stored on the server side. I
> >> > > > > > think this problem could be solved by extending the RemoteSearchable
> >> > > > > > class with a default constructor that reads the names of the
> >> > > > > > indexes to be published out of a configuration file on the server
> >> > > > > > side.
> >> > > > > >
> >> > > > > My "Server" now implements a class which inherits directly from
> >> > > > > RemoteSearchable. In its parameterless constructor I read the
> >> > > > > server-side config file, which contains the index location, create
> >> > > > > a new IndexSearcher and pass it as an argument to MyBase.New().
> >> > > > > See the sample below.
> >> > > > >
> >> > > > > > 2. Bug in Term construction
> >> > > > > [snip]
> >> > > > >
> >> > > > > This whole chapter was very useful and I can confirm that
> >> > > > > everything works fine from there on.
> >> > > > >
> >> > > > > But there is still a bug in FieldDocSortedHitQueue, line 130 and
> >> > > > > below:
> >> > > > > I figured out that the conversions do not work when the system is
> >> > > > > running in a non-English globalization context.
> >> > > > > The string in docA.fields[i], which might be for example 1.345678,
> >> > > > > is converted to 1345678.0, since the decimal separator is
> >> > > > > apparently misinterpreted on German systems.
> >> > > > > So the conversion results in an overflow.
> >> > > > >
> >> > > > > So I changed it as follows:
> >> > > > >
> >> > > > > case SortField.SCORE:
> >> > > > > float r1 = (float)Convert.ToSingle(docA.fields[i],
> >> > > > > System.Globalization.NumberFormatInfo.InvariantInfo );
> >> > > > > float r2 = (float)Convert.ToSingle(docB.fields[i],
> >> > > > > System.Globalization.NumberFormatInfo.InvariantInfo);
> >> > > > > if (r1 > r2)
> >> > > > > c = - 1;
> >> > > > > if (r1 < r2)
> >> > > > > c = 1;
> >> > > > > break;
> >> > > > >
> >> > > > > Same in line 172 and 174:
> >> > > > >
> >> > > > > float f1 = (float)Convert.ToSingle(docA.fields[i],
> >> > > > > System.Globalization.NumberFormatInfo.InvariantInfo);
> >> > > > > //UPGRADE_TODO: The equivalent in .NET for method
> >> > > > > //'java.lang.Float.floatValue' may return a different value.
> >> > > > > //"ms-help://MS.VSCC.v80/dv_commoner/local/redirect.htm?index='!DefaultContextWindowIndex'&keyword='jlca1043'"
> >> > > > > float f2 = (float)Convert.ToSingle(docB.fields[i],
> >> > > > > System.Globalization.NumberFormatInfo.InvariantInfo );
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > A tiny client/server solution now looks like this (here in VB.NET):
> >> > > > > SERVER:
> >> > > > > Public Class RemoteQuery
> >> > > > >     Inherits RemoteSearchable
> >> > > > >     ' Parameterless constructor, needed for server-side activation
> >> > > > >     Public Sub New()
> >> > > > >         MyBase.New(New IndexSearcher("C:\lucene\index"))
> >> > > > >     End Sub
> >> > > > >     Public Sub New(ByVal local As Searchable)
> >> > > > >         MyBase.New(local)
> >> > > > >     End Sub
> >> > > > > End Class
> >> > > > >
> >> > > > > Module Module1
> >> > > > >     Public Sub Main(ByVal args As System.String())
> >> > > > >         Dim chnl As New HttpChannel(8888)
> >> > > > >         ChannelServices.RegisterChannel(chnl, False)
> >> > > > >         Dim indexName As System.String = Nothing
> >> > > > >         RemotingConfiguration.RegisterWellKnownServiceType( _
> >> > > > >             GetType(RemoteQuery), "Searchable", WellKnownObjectMode.Singleton)
> >> > > > >         ' Keep the server alive until Enter is pressed
> >> > > > >         System.Console.ReadLine()
> >> > > > >     End Sub
> >> > > > > End Module
> >> > > > > CLIENT:
> >> > > > > Sub Main()
> >> > > > >     Dim searchables As Lucene.Net.Search.Searchable() = _
> >> > > > >         New Lucene.Net.Search.Searchable() {LookupRemote()}
> >> > > > >     Dim searcher As Searcher = New MultiSearcher(searchables)
> >> > > > >     Dim sort As New Lucene.Net.Search.Sort
> >> > > > >     sort.SetSort(Lucene.Net.Search.SortField.FIELD_SCORE)
> >> > > > >     Dim query As Query = QueryParser.Parse("Harry", "body", _
> >> > > > >         New StandardAnalyzer())
> >> > > > >     Dim result As Hits = searcher.Search(query, sort)
> >> > > > > End Sub
> >> > > > > Private Function LookupRemote() As Lucene.Net.Search.Searchable
> >> > > > >     Return CType(Activator.GetObject( _
> >> > > > >         GetType(Lucene.Net.Search.Searchable), _
> >> > > > >         "http://192.168.8.7:8888/Searchable"), _
> >> > > > >         Lucene.Net.Search.Searchable)
> >> > > > > End Function
> >> > > > >
> >> > > > > Hope this helps you and anybody else who has had problems with
> >> > > > > remote search so far.
> >> > > > >
> >> > > > > BTW: this all refers to version 1.9rc1
> >> > > > >
> >> > > > > --Robert Boulanger
> >> > > > >
> >> > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>
>
>