Posted to solr-user@lucene.apache.org by Niklas Langvig <ni...@globesoft.com> on 2013/03/15 16:10:30 UTC

solr cell

We have all our documents (doc, docx, pdf) on a Linux file server (~8 000 000 documents). Is there a good way to update Solr when documents are added to or deleted from the file server?
On Windows you could have a WMI script that gets notified when a document has been added or removed and then makes the appropriate update in Solr.

Can this be solved somehow?

Thanks
Niklas

Re: solr cell

Posted by Jack Krupansky <ja...@basetechnology.com>.
Take a look at ManifoldCF, which has a file system crawler that can track
changed files.

-- Jack Krupansky


Re: solr cell

Posted by Arcadius Ahouansou <ar...@menelic.com>.
Another option similar to this would be the new file system
WatchService available in Java 7:
http://docs.oracle.com/javase/tutorial/essential/io/notification.html
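A minimal sketch of what that could look like, assuming a single flat directory is being watched. The class name DocWatcher and the "index" / "delete" / "rescan" labels are made up for illustration; only the java.nio.file calls are the real Java 7 API. The sketch just prints what it would do and does not talk to Solr.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;

import static java.nio.file.StandardWatchEventKinds.ENTRY_CREATE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_DELETE;
import static java.nio.file.StandardWatchEventKinds.ENTRY_MODIFY;

// Hypothetical helper, not part of Solr or ManifoldCF.
final class DocWatcher {

    // Maps a watch event kind to the action we would take in Solr.
    // The returned labels are illustrative only.
    public static String actionFor(WatchEvent.Kind<?> kind) {
        if (kind == ENTRY_CREATE || kind == ENTRY_MODIFY) {
            return "index";
        }
        if (kind == ENTRY_DELETE) {
            return "delete";
        }
        // OVERFLOW: events were lost, so the directory needs a full rescan.
        return "rescan";
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Directory to watch; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");

        try (WatchService watcher = FileSystems.getDefault().newWatchService()) {
            // Registers this one directory only; subdirectories are not included.
            dir.register(watcher, ENTRY_CREATE, ENTRY_MODIFY, ENTRY_DELETE);

            while (true) {
                WatchKey key = watcher.take(); // blocks until events arrive
                for (WatchEvent<?> event : key.pollEvents()) {
                    Object ctx = event.context(); // relative Path, or null on OVERFLOW
                    Path changed = (ctx instanceof Path) ? dir.resolve((Path) ctx) : dir;
                    System.out.println(actionFor(event.kind()) + " " + changed);
                }
                // reset() returns false when the directory can no longer be watched.
                if (!key.reset()) {
                    break;
                }
            }
        }
    }
}
```

Note that a registration covers one directory, not its subdirectories, so a tree holding millions of documents would need every directory registered separately (and new directories registered as they appear). Events that happen while the watcher is not running are not replayed, so a periodic full scan would probably still be needed.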


Arcadius.

On 15 March 2013 15:22, Michael Della Bitta
<mi...@appinions.com> wrote:
> Niklas,
>
> In Linux, the API for watching for filesystem changes is called
> inotify. You'd need to write something to listen to those events and
> react accordingly.
>
> Here's a brief discussion about it:
> http://stackoverflow.com/questions/4062806/inotify-how-to-use-it-linux

Re: solr cell

Posted by Michael Della Bitta <mi...@appinions.com>.
Niklas,

In Linux, the API for watching for filesystem changes is called
inotify. You'd need to write something to listen to those events and
react accordingly.

Here's a brief discussion about it:
http://stackoverflow.com/questions/4062806/inotify-how-to-use-it-linux
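As for "react accordingly", here is a rough sketch of the two Solr requests such a listener might build, assuming the file path is used as the document id. The class and method names are hypothetical, and the base URL is whatever your Solr instance uses. It relies on the Solr Cell handler being mapped at /update/extract and on the JSON delete-by-id command of the update handler; check both against your solrconfig.xml. Actually sending the requests (and committing) is left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that only builds request strings; it sends nothing.
final class SolrRequests {

    // URL to POST a file's bytes to so that Solr Cell extracts and indexes it.
    // literal.id sets the document id; here we assume it is the file path.
    public static String extractUrl(String solrBase, String docId) {
        try {
            return solrBase + "/update/extract?literal.id="
                    + URLEncoder.encode(docId, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always supported by the JVM, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // JSON body for a delete-by-id, to be POSTed to the update handler
    // with Content-Type application/json.
    // Simplified escaping: handles backslashes and quotes only.
    public static String deleteBody(String docId) {
        String escaped = docId.replace("\\", "\\\\").replace("\"", "\\\"");
        return "{\"delete\":{\"id\":\"" + escaped + "\"}}";
    }
}
```

For a created or modified file the listener would POST the file to extractUrl(...), and for a removed file it would POST deleteBody(...). Using the path as the id is what makes the delete possible, since the file content is already gone by the time the delete event arrives.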


Michael Della Bitta

------------------------------------------------
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game

