Posted to solr-user@lucene.apache.org by Jorg Heymans <jo...@gmail.com> on 2009/12/08 13:02:38 UTC

Tika and DIH integration (https://issues.apache.org/jira/browse/SOLR-1358)

Hi,

I am looking into using Solr for indexing a large database that has
documents (mostly pdf and msoffice) stored as CLOBs in several tables.
It is my understanding that the DIH as provided in Solr 1.4 cannot
index these CLOBs yet, and that SOLR-1358 should provide exactly this.
So I was wondering what the most 'recommended' way of solving this
is. Should it be done with a custom text extractor of some sort, set on
the column/field?

Thanks,
Jorg

Re: Tika and DIH integration (https://issues.apache.org/jira/browse/SOLR-1358)

Posted by Noble Paul നോബിള്‍ नोब्ळ् <no...@corp.aol.com>.

We are very close to resolving SOLR-1358, so you may be able to use it.

-- 
-----------------------------------------------------
Noble Paul | Systems Architect| AOL | http://aol.com