Posted to java-user@lucene.apache.org by Álvaro Vargas Quezada <al...@outlook.com> on 2013/04/10 23:31:38 UTC
How to index Sharepoint files with Lucene
Hi everyone!
I'm trying to integrate Lucene with SharePoint (we use Windows and SharePoint 2010), but I couldn't find good tutorials or proven test cases that demonstrate this integration. Do you know of any tutorial, or can you give me some help with this?
I have read all of "Lucene in Action", but it only talks about indexing files, not about integration with other software.
Thanks in advance. Greetings from Chile.
Re: How to index Sharepoint files with Lucene
Posted by Jack Krupansky <ja...@basetechnology.com>.
The Apache ManifoldCF "connector framework" has a SharePoint connector that
can crawl SharePoint repositories. It has an output connector that feeds
into Solr/SolrCell, but you can easily develop a connector that outputs
whatever you want: for example, one that puts the crawled files into a file
system directory, or maybe even one that sends each file directly into Tika
and then indexes the extracted content directly into Lucene, if that's what
you want. In any case, MCF handles the SharePoint access and crawling.
See:
http://manifoldcf.apache.org/en_US/index.html
-- Jack Krupansky
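The file-system hand-off suggested in the reply above can be sketched roughly as follows. This is only a minimal sketch using the Java standard library, assuming the crawler has already written the SharePoint files into a local directory. The names CrawledDirectoryIndexer, TextExtractor and DocumentSink are hypothetical and are not part of ManifoldCF, Tika or Lucene. In a real setup, the extractor would most likely delegate to Tika, and the sink would build a Lucene Document and pass it to an IndexWriter.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: walks a directory of already-crawled files and hands
// each one to an indexing callback. Not part of ManifoldCF, Tika or Lucene.
class CrawledDirectoryIndexer {

    // Assumed extraction step; Tika would take this role for binary formats.
    interface TextExtractor {
        String extract(Path file) throws IOException;
    }

    // Assumed indexing step; a Lucene-backed sink would add a document here.
    interface DocumentSink {
        void add(String relativePath, String text) throws IOException;
    }

    // Visits every regular file under root in sorted order and returns how
    // many files were handed to the sink.
    static int indexDirectory(Path root, TextExtractor extractor, DocumentSink sink)
            throws IOException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        int count = 0;
        for (Path file : files) {
            String text = extractor.extract(file);
            sink.add(root.relativize(file).toString(), text);
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Directory to scan; defaults to the current directory.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        // Stand-in extractor: treats every file as UTF-8 plain text.
        TextExtractor plainText =
                file -> new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        // Stand-in sink: prints what would be indexed instead of indexing it.
        DocumentSink printer =
                (path, text) -> System.out.println(path + " (" + text.length() + " chars)");
        int n = indexDirectory(root, plainText, printer);
        System.out.println("Handed off " + n + " file(s)");
    }
}
```

Keeping extraction and indexing behind two small interfaces means the directory walk can be tried with plain text files first, and the Tika and Lucene pieces can be plugged in later without changing the loop.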