Posted to solr-user@lucene.apache.org by Britt <bg...@cox.net> on 2012/11/12 23:19:15 UTC

Example for Scheduling Solr Indexing - Hadoop

Background
I have a file that gets dropped into a new directory every 10 minutes.
Examples:
/2012/11/05/HH/10/bigfile.txt
/2012/11/05/HH/20/bigfile.txt
/2012/11/05/HH/30/bigfile.txt
/2012/11/05/HH/40/bigfile.txt

I need to schedule a job to index these files every 10 minutes.
Example index directories:
/2012/11/05/HH/10/indexes/
/2012/11/05/HH/20/indexes/
/2012/11/05/HH/30/indexes/
/2012/11/05/HH/40/indexes/

Anyone have an example of how to do this?




--
View this message in context: http://lucene.472066.n3.nabble.com/Example-for-Scheduling-Solr-Indexing-Hadoop-tp4019862.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Example for Scheduling Solr Indexing - Hadoop

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi,

This could be as simple as writing an app that periodically checks the
appropriate directories, looks for any files added since the last check,
reads and parses them (presumably those files contain the data for the
records/documents that need to be indexed), constructs SolrInputDocuments,
and sends them to Solr via SolrJ, if you want to use Java, that is. Or, if
you want to get fancy (and overly complicated for this particular use
case), you could use Flume and its new SpoolDirectory source together with
the soon-to-be-written Flume Solr sink. :)
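
A rough sketch of the polling half of that idea, using only the JDK. The
class name NewFileScanner, the bigfile.txt filter, and the indexer callback
are assumptions made for illustration, not something from the thread. The
callback is where the SolrJ work would go (parse the file, build
SolrInputDocument objects, add them through a Solr client, commit); it is
left as a stub here because the file format is not known.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical poller: reports files named bigfile.txt not handled by an earlier scan. */
public class NewFileScanner {
    private final Path root;
    private final Consumer<Path> indexer;          // stand-in for the SolrJ indexing step
    private final Set<Path> seen = new HashSet<>(); // in-memory only; lost on restart

    public NewFileScanner(Path root, Consumer<Path> indexer) {
        this.root = root;
        this.indexer = indexer;
    }

    /** Walks the tree once, passes each new file to the indexer, and returns those files. */
    public synchronized List<Path> scanOnce() throws IOException {
        List<Path> candidates;
        try (Stream<Path> paths = Files.walk(root)) {
            candidates = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().equals("bigfile.txt"))
                    .sorted()
                    .collect(Collectors.toList());
        }
        List<Path> fresh = new ArrayList<>();
        for (Path p : candidates) {
            if (!seen.contains(p)) {
                indexer.accept(p); // if this throws, the file is retried on the next scan
                seen.add(p);
                fresh.add(p);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        NewFileScanner scanner =
                new NewFileScanner(root, p -> System.out.println("would index " + p));
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        // Scan immediately, then every 10 minutes, matching the drop interval in the question.
        timer.scheduleAtFixedRate(() -> {
            try {
                scanner.scanOnce();
            } catch (Exception e) {
                // Catch everything: an uncaught exception would cancel all future runs.
                e.printStackTrace();
            }
        }, 0, 10, TimeUnit.MINUTES);
    }
}
```

Two things this sketch does not handle: a file that is still being written
when the scan runs, and remembering what was indexed across restarts. Both
would need something extra, for example a rename-when-complete convention
on the producer side and a small state file.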

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html

