Posted to solr-user@lucene.apache.org by hadi <md...@gmail.com> on 2011/09/23 22:53:21 UTC
indexing FTP document with solrj
I want to index some documents with the SolrJ API, but the URLs of these
documents are FTP URLs.
How do I set the username and password for the FTP account in SolrJ?
The SolrJ API has a CommonsHttpSolrServer class, but I cannot find any
method for FTP configuration.
Thanks
Re: indexing FTP document with solrj
Posted by Marc SCHNEIDER <ma...@gmail.com>.
Hello,
To extract the content of the documents, you can use Apache Tika before
sending it to Solr (via SolrJ).
Regards,
Marc.
On Wed, Oct 5, 2011 at 1:16 AM, Chris Hostetter <ho...@fucit.org> wrote:
>
> : I want to index some documents with the SolrJ API, but the URLs of these
> : documents are FTP URLs.
> : How do I set the username and password for the FTP account in SolrJ?
> :
> : The SolrJ API has a CommonsHttpSolrServer class, but I cannot find any
> : method for FTP configuration.
>
> It sounds like you are confusing using SolrJ to talk to
> *Solr* with using SolrJ to index arbitrary URLs.
>
> SolrJ doesn't do any crawling -- if you have data that you want to index,
> then your client code needs to decide what that data is (and where it
> comes from) and feed that data to SolrJ as "documents" to index. The only
> URLs that SolrJ knows about are:
> * the URL for talking to Solr
> * "strings" that SolrJ passes to Solr as document fields that may just so
> happen to be URLs (SolrJ doesn't know/care)
>
> -Hoss
>
Re: indexing FTP document with solrj
Posted by Chris Hostetter <ho...@fucit.org>.
: I want to index some documents with the SolrJ API, but the URLs of these
: documents are FTP URLs.
: How do I set the username and password for the FTP account in SolrJ?
:
: The SolrJ API has a CommonsHttpSolrServer class, but I cannot find any
: method for FTP configuration.
It sounds like you are confusing using SolrJ to talk to
*Solr* with using SolrJ to index arbitrary URLs.
SolrJ doesn't do any crawling -- if you have data that you want to index,
then your client code needs to decide what that data is (and where it
comes from) and feed that data to SolrJ as "documents" to index. The only
URLs that SolrJ knows about are:
* the URL for talking to Solr
* "strings" that SolrJ passes to Solr as document fields that may just so
happen to be URLs (SolrJ doesn't know/care)
-Hoss
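
To illustrate the division of labour described in the reply above, here is a
minimal sketch of client code that fetches a file over FTP itself and only
then prepares values to hand to SolrJ. It is not code from this thread. The
class name FtpFetchSketch, the host, account, path, the field names "id" and
"text", and the Solr URL in the comments are all hypothetical. The FTP
credentials go into the user-info part of an ftp:// URL, which the JDK's
built-in URL handler reads; SolrJ never sees them. The SolrJ calls are left as
comments so the sketch compiles without SolrJ on the classpath.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class FtpFetchSketch {

    /** Builds an ftp:// URI that carries the account credentials as user-info. */
    static URI ftpUri(String user, String password, String host, String path) {
        try {
            // The multi-argument constructor quotes characters that are illegal in each part.
            return new URI("ftp", user + ":" + password, host, -1, path, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build FTP URI", e);
        }
    }

    /** Downloads the resource; the login is done by the JDK's ftp: URL handler, not by SolrJ. */
    static byte[] fetch(URI uri) throws IOException {
        try (InputStream in = uri.toURL().openStream();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    /** Collects the values the client decided to index ("id" and "text" are assumed schema fields). */
    static Map<String, Object> toFields(String id, String text) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("id", id);     // a credential-free identifier, so the password is never indexed
        fields.put("text", text);
        return fields;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 4) {
            System.err.println("usage: FtpFetchSketch <user> <password> <host> <path>");
            return;
        }
        URI uri = ftpUri(args[0], args[1], args[2], args[3]);
        String text = new String(fetch(uri), StandardCharsets.UTF_8);
        Map<String, Object> fields = toFields("ftp://" + args[2] + args[3], text);
        System.out.println("Fetched " + text.length() + " characters; fields: " + fields.keySet());

        // With SolrJ on the classpath, the fields could then be sent to Solr, for example:
        //   SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
        //   SolrInputDocument doc = new SolrInputDocument();
        //   for (Map.Entry<String, Object> e : fields.entrySet()) {
        //       doc.addField(e.getKey(), e.getValue());
        //   }
        //   server.add(doc);
        //   server.commit();
    }
}
```

For binary formats such as PDF or Word, the fetched bytes would first need to
go through a text extractor such as Apache Tika, as suggested earlier in the
thread, instead of being decoded directly as UTF-8.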