Posted to solr-user@lucene.apache.org by Niklas Langvig <ni...@globesoft.com> on 2013/03/15 16:10:30 UTC
solr cell
We have all our documents (doc, docx, pdf) on a Linux file server (~8,000,000 documents). Is there a good way to update Solr when documents are added to or deleted from the file server?
On Windows you could have a WMI script that is notified when a document has been added or removed and then makes the appropriate update in Solr.
Can this be solved somehow?
Thanks
Niklas
Re: solr cell
Posted by Jack Krupansky <ja...@basetechnology.com>.
Take a look at ManifoldCF, which has a file system crawler that can track
changed files.
-- Jack Krupansky
Re: solr cell
Posted by Arcadius Ahouansou <ar...@menelic.com>.
Another option, similar to the inotify approach Michael describes, would be
the file system WatchService API introduced in Java 7:
http://docs.oracle.com/javase/tutorial/essential/io/notification.html
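As a rough illustration of the WatchService approach, a minimal sketch might look like the following. The class name DocumentWatcher, the actionFor helper, and its "index"/"delete"/"rescan" labels are made up for this example; only the java.nio.file calls are standard API. Note that a WatchService watches just the directory it is registered on, not its subdirectories, so a real document tree would need one registration per directory.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import static java.nio.file.StandardWatchEventKinds.*;

// Hypothetical example class; not part of Solr or the JDK.
class DocumentWatcher {

    // Maps a watch event kind to the index action it should trigger.
    // The action labels are assumptions made for this sketch.
    static String actionFor(WatchEvent.Kind<?> kind) {
        if (kind == ENTRY_DELETE) {
            return "delete";
        }
        if (kind == ENTRY_CREATE || kind == ENTRY_MODIFY) {
            return "index";
        }
        return "rescan"; // OVERFLOW: events may have been lost
    }

    // Watches a single directory (not its subdirectories) and prints
    // the action that would be taken for each change.
    static void watch(Path dir) throws IOException, InterruptedException {
        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            dir.register(watcher, ENTRY_CREATE, ENTRY_MODIFY, ENTRY_DELETE);
            while (true) {
                WatchKey key = watcher.take(); // blocks until events arrive
                for (WatchEvent<?> event : key.pollEvents()) {
                    WatchEvent.Kind<?> kind = event.kind();
                    if (kind == OVERFLOW) {
                        System.out.println(actionFor(kind) + " " + dir);
                        continue;
                    }
                    // For these kinds the context is the file name
                    // relative to the watched directory.
                    Path changed = dir.resolve((Path) event.context());
                    System.out.println(actionFor(kind) + " " + changed);
                }
                if (!key.reset()) {
                    break; // the directory is no longer accessible
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        watch(Paths.get(args.length > 0 ? args[0] : "."));
    }
}
```

Run it with the directory to watch as the only argument. The println calls mark the places where the Solr update would go.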
Arcadius.
On 15 March 2013 15:22, Michael Della Bitta
<mi...@appinions.com> wrote:
> Niklas,
>
> In Linux, the API for watching for filesystem changes is called
> inotify. You'd need to write something to listen to those events and
> react accordingly.
>
> Here's a brief discussion about it:
> http://stackoverflow.com/questions/4062806/inotify-how-to-use-it-linux
Re: solr cell
Posted by Michael Della Bitta <mi...@appinions.com>.
Niklas,
In Linux, the API for watching for filesystem changes is called
inotify. You'd need to write something to listen to those events and
react accordingly.
Here's a brief discussion about it:
http://stackoverflow.com/questions/4062806/inotify-how-to-use-it-linux
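To make "react accordingly" a little more concrete, here is a hedged sketch of the two Solr requests such a listener might issue. It assumes the Solr Cell extracting handler is enabled at /update/extract, that the schema's unique key field is named id, and that the file path is used as the document id. The base URL, the class name SolrRequests, and both helper methods are hypothetical, and the code only builds the requests rather than sending them.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper class; not part of Solr or SolrJ.
class SolrRequests {

    // Assumed base URL of the Solr core; adjust to your installation.
    static final String SOLR = "http://localhost:8983/solr/collection1";

    // URL to which the bytes of a new or changed file would be POSTed
    // so that Solr Cell extracts its text and indexes it under this id.
    static String extractUrl(String filePath) {
        try {
            return SOLR + "/update/extract?literal.id="
                    + URLEncoder.encode(filePath, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // XML body that would be POSTed to SOLR + "/update" when a file
    // has been removed from the file server.
    static String deleteBody(String filePath) {
        String escaped = filePath
                .replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
        return "<delete><id>" + escaped + "</id></delete>";
    }

    public static void main(String[] args) {
        System.out.println(extractUrl("/srv/docs/report 1.pdf"));
        System.out.println(deleteBody("/srv/docs/old & unused.doc"));
    }
}
```

Neither change becomes visible to searches until a commit happens, so with millions of documents it is probably better to commit in batches than after every event.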
Michael Della Bitta
------------------------------------------------
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271
www.appinions.com
Where Influence Isn’t a Game